Update README.md
README.md CHANGED
@@ -1,3 +1,14 @@
----
-license: mit
----
+---
+license: mit
+pipeline_tag: text-generation
+tags: [ONNX, ONNXRuntime, phi3.5, nlp, conversational, custom_code]
+inference: false
+---
+
+Based on https://huggingface.co/microsoft/Phi-4-mini-instruct
+
+The ONNX model was converted using https://github.com/microsoft/onnxruntime-genai
+
+Using the command: python -m onnxruntime_genai.models.builder -m microsoft/Phi-4-mini-instruct -o Phi-4-mini-instruct-onnx -e webgpu -c cache-dir -p int4 --extra_options int4_block_size=32 int4_accuracy_level=4
+
+The generated external data file (model.onnx.data) is larger than 2GB, which is not suitable for ORT-Web, so I use an additional Python script to move some of the data into model.onnx.
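
The repartitioning script itself is not included in the commit. Below is a minimal sketch of how such a script could look, assuming the `onnx` Python package; the file paths, output folder name, and size threshold are illustrative assumptions, not the exact values used.

```python
# repartition_external_data.py -- a minimal sketch, NOT the exact script used here.
# Assumes the `onnx` Python package; paths and the size threshold are assumptions
# and would need tuning for the real model.
import os
import onnx

SRC = "Phi-4-mini-instruct-onnx/model.onnx"   # output of the model builder (assumed path)
DST_DIR = "Phi-4-mini-instruct-onnx-web"      # hypothetical output folder
DST = os.path.join(DST_DIR, "model.onnx")

os.makedirs(DST_DIR, exist_ok=True)

# Load the graph together with its external weights (model.onnx.data).
model = onnx.load(SRC, load_external_data=True)

# Re-save: tensors below size_threshold are embedded in model.onnx, only larger
# ones stay in model.onnx.data. Raising the threshold moves more data into
# model.onnx; model.onnx itself must also stay under the 2GB protobuf limit,
# so the threshold has to balance the two files.
onnx.save_model(
    model,
    DST,
    save_as_external_data=True,
    all_tensors_to_one_file=True,
    location="model.onnx.data",
    size_threshold=4 * 1024 * 1024,  # 4 MB cutoff -- an assumed value, tune until both files fit
    convert_attribute=False,
)

# Report the resulting file sizes.
for name in ("model.onnx", "model.onnx.data"):
    path = os.path.join(DST_DIR, name)
    print(name, round(os.path.getsize(path) / 2**30, 2), "GiB")
```

The threshold is the only knob in this sketch: it trades model.onnx size against model.onnx.data size, with the goal stated above of keeping each file under 2GB for ORT-Web.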