Add/update the quantized ONNX model files and README.md for Transformers.js v3
## Applied Quantizations
### ✅ Based on `decoder_model_merged.onnx` *with* slimming
↳ ✅ `fp16`: `decoder_model_merged_fp16.onnx` (replaced because it was invalid)
↳ ✅ `int8`: `decoder_model_merged_int8.onnx` (added)
↳ ✅ `uint8`: `decoder_model_merged_uint8.onnx` (added)
↳ ✅ `q4`: `decoder_model_merged_q4.onnx` (added)
↳ ✅ `q4f16`: `decoder_model_merged_q4f16.onnx` (added)
↳ ✅ `bnb4`: `decoder_model_merged_bnb4.onnx` (added)
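The list above follows a simple naming pattern: the quantization name is appended to the base file name as a suffix. A minimal sketch of that pattern, assuming it holds only for the six variants listed in this PR; the helper name `quantized_path` and the `onnx` folder default are illustrative and not part of any library:

```python
# Quantization names taken from the list in this PR description.
LISTED_DTYPES = ("fp16", "int8", "uint8", "q4", "q4f16", "bnb4")


def quantized_path(dtype: str,
                   base: str = "decoder_model_merged",
                   folder: str = "onnx") -> str:
    """Return the repo-relative path of a quantized variant (hypothetical helper).

    Only the variants listed in this PR are accepted; anything else is
    rejected rather than guessed.
    """
    if dtype not in LISTED_DTYPES:
        raise ValueError(f"unlisted quantization: {dtype!r}")
    return f"{folder}/{base}_{dtype}.onnx"


if __name__ == "__main__":
    for name in LISTED_DTYPES:
        print(name, "->", quantized_path(name))
```

Rejecting unlisted names keeps the helper from producing paths to files this PR does not add.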
onnx/decoder_model_merged_bnb4.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9ac868811f3c1d734e009beb092a2fcea9c60ab4bbb46340dc2e6ee175652532
+size 89735909
onnx/decoder_model_merged_fp16.onnx
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:701738e78b66137e6237dccecfe23d76a7e4e898bf55f16d792df2d0f7b8fd80
+size 116622030
onnx/decoder_model_merged_int8.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:66474c5e6fd2899557541cfd6207091ca9b729418097e154a03645f420b2c0c8
+size 58766935
onnx/decoder_model_merged_q4.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5c485fedf6a4984362a85c114b0f7af8f1935be44f1f287885bf2532f05b6e69
+size 92335654
onnx/decoder_model_merged_q4f16.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ceacdfd6609d41c5006198576a6b794097283598d5e0a1d7bb31e324658b767f
+size 56823126
onnx/decoder_model_merged_uint8.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:74826be1b4d9f21d579cc00342bb3538480370cd5b457ec5f76b62c10442e349
+size 58766988
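Each file in this diff is a Git LFS pointer: three `key value` lines giving the spec version, the SHA-256 of the real file, and its size in bytes. As a rough sketch of how a downloaded `.onnx` file could be checked against its pointer, the following uses only the Python standard library. The function names `parse_lfs_pointer` and `matches_pointer` are illustrative and are not part of Git LFS or any Hugging Face tooling; the example pointer text is the `bnb4` entry from this diff.

```python
import hashlib
from pathlib import Path

# Pointer contents copied from the bnb4 entry in this diff.
EXAMPLE_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:9ac868811f3c1d734e009beb092a2fcea9c60ab4bbb46340dc2e6ee175652532
size 89735909
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a pointer file into version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict) -> bool:
    """Return True if the file at `path` has the pointer's size and digest."""
    file_path = Path(path)
    # Cheap check first: a size mismatch makes hashing unnecessary.
    if file_path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with file_path.open("rb") as handle:
        # Read in 1 MiB chunks so large model files are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


if __name__ == "__main__":
    pointer = parse_lfs_pointer(EXAMPLE_POINTER)
    print(pointer["algo"], pointer["size"])
```

Given a local copy of the model, `matches_pointer("decoder_model_merged_bnb4.onnx", parse_lfs_pointer(EXAMPLE_POINTER))` would report whether the download is intact; the local path here is an assumption.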