Add/update the quantized ONNX model files and README.md for Transformers.js v3
#4 opened by whitphx (HF Staff)
Applied Quantizations
✅ Based on decoder_model_merged.onnx with slimming
↳ ✅ fp16: decoder_model_merged_fp16.onnx (replaced because the existing file was invalid)
↳ ✅ int8: decoder_model_merged_int8.onnx (added)
↳ ✅ uint8: decoder_model_merged_uint8.onnx (added)
↳ ✅ q4: decoder_model_merged_q4.onnx (added)
↳ ✅ q4f16: decoder_model_merged_q4f16.onnx (added)
↳ ✅ bnb4: decoder_model_merged_bnb4.onnx (added)
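The file names above follow a regular pattern: the base name, an underscore, the quantization label, then `.onnx`. A minimal sketch of that naming pattern follows. The helper `quantizedFileName` and the `Variant` type are hypothetical names introduced for illustration only. They are not part of Transformers.js, and the sketch assumes nothing beyond the six files listed in this pull request.

```typescript
// Hypothetical helper for illustration; not a Transformers.js API.
// The labels below are exactly the six variants listed in this pull request.
type Variant = "fp16" | "int8" | "uint8" | "q4" | "q4f16" | "bnb4";

const VARIANTS: readonly Variant[] = ["fp16", "int8", "uint8", "q4", "q4f16", "bnb4"];

// Build the ONNX file name for a variant: `<base>_<variant>.onnx`.
// The default base name is taken from the list above.
function quantizedFileName(variant: Variant, base: string = "decoder_model_merged"): string {
  return `${base}_${variant}.onnx`;
}

// Enumerate every file name this pull request adds or replaces.
const files: string[] = VARIANTS.map((v) => quantizedFileName(v));
console.log(files.join("\n"));
```

A lookup like this can be handy for checking that every expected file is present in the repository before merging.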
Xenova changed pull request status to merged