ibelem committed on
Commit be34995 · verified · 1 Parent(s): 5e9ac47

Update README.md

Files changed (1)
  1. README.md +14 -3
README.md CHANGED
@@ -1,3 +1,14 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ pipeline_tag: text-generation
+ tags: [ONNX, ONNXRuntime, phi3.5, nlp, conversational, custom_code]
+ inference: false
+ ---
+
+ Based on https://huggingface.co/microsoft/Phi-4-mini-instruct
+
+ Convert ONNX model by using https://github.com/microsoft/onnxruntime-genai
+
+ Using command: python -m onnxruntime_genai.models.builder -m microsoft/Phi-4-mini-instruct -o Phi-4-mini-instruct-onnx -e webgpu -c cache-dir -p int4 --extra_options int4_block_size=32 int4_accuracy_level=4
+
+ The generated external data (model.onnx.data) is larger than 2GB, which is not suitable for ORT-Web. I use an additional Python script to move some data into model.onnx.
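
The README does not include the Python script it mentions, so the following is only a sketch of one way such a rebalancing step could look, not the author's actual script. It assumes the `onnx` package is installed and relies on its external-data helpers (`uses_external_data`, `load_external_data_for_model`, `set_external_data`). The function names `plan_inline_tensors` and `rebalance_external_data`, both byte budgets, and the greedy largest-first strategy are hypothetical choices made for illustration.

```python
import os
from typing import Dict, List, Tuple

# Hypothetical budgets (assumptions, not values from the README):
# keep the external file under 2 GiB with a small safety margin, and keep the
# inlined weights well below protobuf's 2 GiB serialization limit.
EXTERNAL_BUDGET = 2 * 1024**3 - 64 * 1024**2
INLINE_BUDGET = 1536 * 1024**2


def plan_inline_tensors(
    sizes: Dict[str, int], external_budget: int, inline_budget: int
) -> Tuple[List[str], int, int]:
    """Greedily pick tensors to embed in model.onnx.

    Returns (names_to_inline, external_bytes_left, inline_bytes).
    Raises ValueError if the greedy pass cannot meet both budgets.
    """
    external = sum(sizes.values())
    inline = 0
    chosen: List[str] = []
    # Largest tensors first, ties broken by name for a deterministic plan.
    for name, size in sorted(sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        if external <= external_budget:
            break
        if inline + size <= inline_budget:
            chosen.append(name)
            external -= size
            inline += size
    if external > external_budget:
        raise ValueError("greedy plan could not fit both budgets")
    return chosen, external, inline


def rebalance_external_data(
    model_path: str,
    out_path: str,
    data_name: str = "model.onnx.data",
    external_budget: int = EXTERNAL_BUDGET,
    inline_budget: int = INLINE_BUDGET,
) -> None:
    """Rewrite a model so that part of its external data lives in the .onnx file.

    Write to a fresh output directory: onnx appends to an existing data file.
    """
    import onnx  # imported lazily so the planner works without onnx installed
    from onnx.external_data_helper import (
        load_external_data_for_model,
        set_external_data,
        uses_external_data,
    )

    # Load the graph only, so we can see which initializers are external.
    model = onnx.load(model_path, load_external_data=False)
    external = [t for t in model.graph.initializer if uses_external_data(t)]

    # Pull the weights into memory; this marks every tensor as inline.
    load_external_data_for_model(model, os.path.dirname(os.path.abspath(model_path)))
    sizes = {t.name: len(t.raw_data) for t in external}

    inline_names, _, _ = plan_inline_tensors(sizes, external_budget, inline_budget)
    keep_inline = set(inline_names)

    # Re-mark everything that was not chosen as external again.
    for t in external:
        if t.name not in keep_inline:
            set_external_data(t, location=data_name)

    # save_model writes the tensors marked external next to out_path.
    onnx.save_model(model, out_path)
```

A call such as `rebalance_external_data("Phi-4-mini-instruct-onnx/model.onnx", "out/model.onnx")` would then write `out/model.onnx` together with a smaller `out/model.onnx.data`; the paths here are illustrative. Loading the whole model this way needs enough RAM to hold all weights, and the resulting `model.onnx` must itself stay under 2 GB, which is why the sketch uses two separate budgets.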