Error exporting whisper-large-v3-turbo to ONNX with optimum

#58
by rayli1107 - opened

I'm trying to export whisper-large-v3-turbo to ONNX using optimum, and I got the following error. Does anyone know what it means?

optimum-cli export onnx --model openai/whisper-large-v3-turbo whisper_onnx
/home/rayli/optimum-venv/lib/python3.12/site-packages/transformers/models/whisper/modeling_whisper.py:1017: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if input_features.shape[-1] != expected_seq_length:
/home/rayli/optimum-venv/lib/python3.12/site-packages/transformers/models/whisper/modeling_whisper.py:334: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if attn_output.size() != (bsz, self.num_heads, tgt_len, self.head_dim):
Passing a tuple of past_key_values is deprecated and will be removed in Transformers v4.43.0. You should pass an instance of EncoderDecoderCache instead, e.g. past_key_values=EncoderDecoderCache.from_legacy_cache(past_key_values).
/home/rayli/optimum-venv/lib/python3.12/site-packages/transformers/models/whisper/modeling_whisper.py:1477: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if sequence_length != 1:
/home/rayli/optimum-venv/lib/python3.12/site-packages/transformers/cache_utils.py:458: TracerWarning: Using len to get tensor shape might cause the trace to be incorrect. Recommended usage would be tensor.shape[0]. Passing a tensor of different shape might lead to errors or silently give incorrect results.
or len(self.key_cache[layer_idx]) == 0 # the layer has no cache
/home/rayli/optimum-venv/lib/python3.12/site-packages/transformers/cache_utils.py:443: TracerWarning: Using len to get tensor shape might cause the trace to be incorrect. Recommended usage would be tensor.shape[0]. Passing a tensor of different shape might lead to errors or silently give incorrect results.
elif len(self.key_cache[layer_idx]) == 0: # fills previously skipped layers; checking for tensor causes errors
Found different candidate ONNX initializers (likely duplicate) for the tied weights:
model.decoder.embed_tokens.weight: {'model.decoder.embed_tokens.weight'}
proj_out.weight: {'onnx::MatMul_1575'}
Found different candidate ONNX initializers (likely duplicate) for the tied weights:
model.decoder.embed_tokens.weight: {'model.decoder.embed_tokens.weight'}
proj_out.weight: {'onnx::MatMul_1458'}
Killed

Thanks,
Ray

same here on llama

Official availability

OpenAI does not publish official ONNX files for any Whisper model, including whisper-large-v3-turbo; the ONNX conversion is something you (or the community) run yourself.

The large-v3-turbo weights themselves are public, though: OpenAI released them on the Hugging Face Hub as openai/whisper-large-v3-turbo, which is exactly the checkpoint the optimum-cli command above downloads. The model is not API-only.

Community conversions

Community developers have exported the public Whisper checkpoints (e.g., small, medium, large-v3) to ONNX for local inference, and pre-converted ONNX weights for large-v3-turbo are also available on the Hub.

GitHub projects like Whisper-ONNX support exporting public Whisper models from Hugging Face Transformers to ONNX for CPU/GPU/DirectML inference.

Turbo models

whisper-large-v3-turbo is open weights, so access is not the problem here; the export in the log above already downloaded the checkpoint and ran most of the way through.

The TracerWarnings and the EncoderDecoderCache deprecation notice are routine during torch.jit tracing and can usually be ignored, and the "tied weights" messages are informational. The actual failure is the last line, Killed: on Linux this typically means the kernel's OOM killer terminated the process because it ran out of memory. Exporting the model into separate encoder and decoder (with and without past-key-values) ONNX graphs holds several float32 copies of the weights in RAM at once, so the export can need far more memory than inference does. Re-running on a machine with more RAM, or adding swap space, should let the export complete.
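The memory pressure can be sanity-checked with rough arithmetic. A back-of-envelope sketch, assuming the commonly cited ~809M parameter count for large-v3-turbo and an assumed number of simultaneous in-memory copies during tracing/export:

```python
# Rough lower bound on peak RAM for the ONNX export of whisper-large-v3-turbo.
# 809M parameters is the commonly cited size of the model; the number of
# simultaneous float32 copies at peak is an ASSUMPTION (PyTorch model +
# traced graph + serialized initializers across the separate ONNX graphs).
params = 809_000_000
bytes_per_param = 4   # float32
peak_copies = 4       # assumed simultaneous copies during tracing/export

peak_gib = params * bytes_per_param * peak_copies / 2**30
print(f"Estimated peak: {peak_gib:.1f} GiB")
```

That is already ~12 GiB before counting activations, Python overhead, or the ONNX serialization buffers, which is why the export can be OOM-killed on a 16 GB machine even though inference fits comfortably.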
