Update README.md
README.md CHANGED
@@ -29,6 +29,11 @@ On iPhone 15 Pro, the model runs at 17.3 tokens/sec and uses 3206 Mb of memory.
 
 
 
+⚠️ **Caveat:** Our mobile demo apps have **regressed support for the Phi-4 tokenizer**, so this model will not currently run in our official demo apps.
+If you are using your own runner, you can still load and run the `.pte` file successfully.
+(See https://github.com/pytorch/executorch/issues/14077 for details and tracking.)
+
+
 # Quantization Recipe
 
 First need to install the required packages:
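The hunk above ends at "First need to install the required packages:"; the concrete install commands sit outside the diff context. As a hedged sketch of what that step typically looks like for a torchao quantization recipe (the package list and versions are assumptions, not taken from this diff; the full README is authoritative):

```Shell
# Sketch only: typical dependencies for a torchao quantization recipe.
# The exact packages and versions are assumptions; see the full README
# for the canonical install commands.
pip install torch torchao transformers accelerate
```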
@@ -213,7 +218,7 @@ We can run the quantized model on a mobile phone using [ExecuTorch](https://gith
 Once ExecuTorch is [set-up](https://pytorch.org/executorch/main/getting-started.html), exporting and running the model on device is a breeze.
 
 ExecuTorch's LLM export scripts require the checkpoint keys and parameters have certain names, which differ from those used in Hugging Face.
-So we first use a
+So we first use a script that converts the Hugging Face checkpoint key names to ones that ExecuTorch expects:
 ```Shell
 python -m executorch.examples.models.phi_4_mini.convert_weights $(hf download pytorch/Phi-4-mini-instruct-INT8-INT4) pytorch_model_converted.bin
 ```
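After the conversion step above, the README's flow continues with exporting the converted checkpoint to a `.pte` program. A minimal sketch, assuming ExecuTorch's `export_llama` entry point (the flag names are assumptions based on the ExecuTorch Llama examples, not on this diff, and may differ for Phi-4-mini):

```Shell
# Sketch only: export the converted checkpoint to a .pte program.
# export_llama is ExecuTorch's LLM export entry point; these flags are
# assumptions drawn from the Llama examples, not from this README.
python -m executorch.examples.models.llama.export_llama \
  --model phi_4_mini \
  --checkpoint pytorch_model_converted.bin \
  --output_name phi_4_mini.pte
```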
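On the caveat added in the first hunk: "your own runner" can be as simple as ExecuTorch's example CLI runner, which loads a `.pte` file directly and sidesteps the demo apps. A hedged sketch (the binary path and flags follow the ExecuTorch Llama example; the tokenizer path is a placeholder for the Phi-4 tokenizer file):

```Shell
# Sketch only: run the exported program with ExecuTorch's example CLI runner.
# llama_main is built from the ExecuTorch repo via CMake; the flags follow
# the Llama example, and the tokenizer path here is a placeholder.
cmake-out/examples/models/llama/llama_main \
  --model_path=phi_4_mini.pte \
  --tokenizer_path=tokenizer.model \
  --prompt="What is quantization?"
```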