Error exporting whisper-large-v3-turbo to ONNX with optimum

#58
by rayli1107 - opened

I'm trying to export whisper-large-v3-turbo to ONNX using optimum, and I got the following error. Does anyone know what it means?

optimum-cli export onnx --model openai/whisper-large-v3-turbo whisper_onnx
/home/rayli/optimum-venv/lib/python3.12/site-packages/transformers/models/whisper/modeling_whisper.py:1017: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if input_features.shape[-1] != expected_seq_length:
/home/rayli/optimum-venv/lib/python3.12/site-packages/transformers/models/whisper/modeling_whisper.py:334: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if attn_output.size() != (bsz, self.num_heads, tgt_len, self.head_dim):
Passing a tuple of past_key_values is deprecated and will be removed in Transformers v4.43.0. You should pass an instance of EncoderDecoderCache instead, e.g. past_key_values=EncoderDecoderCache.from_legacy_cache(past_key_values).
/home/rayli/optimum-venv/lib/python3.12/site-packages/transformers/models/whisper/modeling_whisper.py:1477: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if sequence_length != 1:
/home/rayli/optimum-venv/lib/python3.12/site-packages/transformers/cache_utils.py:458: TracerWarning: Using len to get tensor shape might cause the trace to be incorrect. Recommended usage would be tensor.shape[0]. Passing a tensor of different shape might lead to errors or silently give incorrect results.
or len(self.key_cache[layer_idx]) == 0 # the layer has no cache
/home/rayli/optimum-venv/lib/python3.12/site-packages/transformers/cache_utils.py:443: TracerWarning: Using len to get tensor shape might cause the trace to be incorrect. Recommended usage would be tensor.shape[0]. Passing a tensor of different shape might lead to errors or silently give incorrect results.
elif len(self.key_cache[layer_idx]) == 0: # fills previously skipped layers; checking for tensor causes errors
Found different candidate ONNX initializers (likely duplicate) for the tied weights:
model.decoder.embed_tokens.weight: {'model.decoder.embed_tokens.weight'}
proj_out.weight: {'onnx::MatMul_1575'}
Found different candidate ONNX initializers (likely duplicate) for the tied weights:
model.decoder.embed_tokens.weight: {'model.decoder.embed_tokens.weight'}
proj_out.weight: {'onnx::MatMul_1458'}
Killed

Thanks,
Ray

same here on llama

Official availability

OpenAI does not publish official ONNX files for any Whisper model, including whisper-large-v3-turbo; the ONNX conversion is something you (or the community) run yourself.

The large-v3-turbo weights themselves are public, though: OpenAI released them on the Hugging Face Hub as openai/whisper-large-v3-turbo, which is exactly the checkpoint the optimum-cli command above downloads. The model is not API-only.

Community conversions

Community developers have exported the public Whisper checkpoints (e.g., small, medium, large-v3) to ONNX for local inference, and pre-converted ONNX weights for large-v3-turbo are also available on the Hub.

GitHub projects like Whisper-ONNX support exporting public Whisper models from Hugging Face Transformers to ONNX for CPU/GPU/DirectML inference.

Turbo models

whisper-large-v3-turbo is open weights, so access is not the problem here; the export in the log above already downloaded the checkpoint and ran most of the way through.

The TracerWarnings and the EncoderDecoderCache deprecation notice are routine during torch.jit tracing and can usually be ignored, and the "tied weights" messages are informational. The actual failure is the last line, Killed: on Linux this typically means the kernel's OOM killer terminated the process because it ran out of memory. Exporting the model into separate encoder and decoder (with and without past-key-values) ONNX graphs holds several float32 copies of the weights in RAM at once, so the export can need far more memory than inference does. Re-running on a machine with more RAM, or adding swap space, should let the export complete.
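The memory pressure can be sanity-checked with rough arithmetic. A back-of-envelope sketch, assuming the commonly cited ~809M parameter count for large-v3-turbo and an assumed number of simultaneous in-memory copies during tracing/export:

```python
# Rough lower bound on peak RAM for the ONNX export of whisper-large-v3-turbo.
# 809M parameters is the commonly cited size of the model; the number of
# simultaneous float32 copies at peak is an ASSUMPTION (PyTorch model +
# traced graph + serialized initializers across the separate ONNX graphs).
params = 809_000_000
bytes_per_param = 4   # float32
peak_copies = 4       # assumed simultaneous copies during tracing/export

peak_gib = params * bytes_per_param * peak_copies / 2**30
print(f"Estimated peak: {peak_gib:.1f} GiB")
```

That is already ~12 GiB before counting activations, Python overhead, or the ONNX serialization buffers, which is why the export can be OOM-killed on a 16 GB machine even though inference fits comfortably.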
