---
language:
- en
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
pipeline_tag: text-generation
---
# Model Card for Meta-Llama-3.1-8B-Instruct ONNX INT4 (DirectML)
<!-- Provide a quick summary of what the model is/does. -->
This model card describes an ONNX Runtime GenAI INT4 quantization of meta-llama/Meta-Llama-3.1-8B-Instruct, optimized for Microsoft DirectML. It is based on [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md?plain=1).
## Model Details
meta-llama/Meta-Llama-3.1-8B-Instruct quantized to ONNX Runtime GenAI INT4 format with Microsoft DirectML optimization.
### Model Description
meta-llama/Meta-Llama-3.1-8B-Instruct quantized to ONNX Runtime GenAI INT4 format with Microsoft DirectML optimization:<br>
https://onnxruntime.ai/docs/genai/howto/install.html#directml

Created using ONNX Runtime GenAI's builder.py:<br>
https://raw.githubusercontent.com/microsoft/onnxruntime-genai/main/src/python/py/models/builder.py

- INT4 accuracy level: FP32 (float32)
- 8-bit quantization for MoE layers
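The conversion described above can be sketched as a builder.py invocation. The output directory name is an illustrative assumption, and the flag values (`-p int4` for INT4 weights, `-e dml` for DirectML, `int4_accuracy_level=1` for FP32 accuracy) reflect the builder's documented options rather than a command taken from the original card:

```shell
# Illustrative sketch only; paths and flag values are assumptions.
# -p int4 selects INT4 weight quantization, -e dml targets DirectML,
# and int4_accuracy_level=1 requests FP32 accuracy for the INT4 matmuls.
python builder.py -m meta-llama/Meta-Llama-3.1-8B-Instruct -o llama31-8b-int4-dml -p int4 -e dml --extra_options int4_accuracy_level=1
```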
- **Developed by:** Mochamad Aris Zamroni
- **Model type:** Causal language model, INT4-quantized ONNX (text-generation)
- **Language(s) (NLP):** English
- **License:** [More Information Needed]
- **Quantized from model:** meta-llama/Meta-Llama-3.1-8B-Instruct
### Model Sources [optional]
- **Repository (base model):** https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]
## Uses
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
### Direct Use
This is a Windows DirectML-optimized model.
Prerequisites:<br>
1. Install Python 3.10 from the Windows Store:<br>
https://apps.microsoft.com/detail/9pjpw5ldxlz5?hl=en-us&gl=US
2. Open a command prompt (cmd.exe).
3. Create a Python virtual environment and install onnxruntime-genai-directml:<br>
mkdir c:\temp<br>
cd c:\temp<br>
python -m venv dmlgenai<br>
dmlgenai\Scripts\activate.bat<br>
pip install onnxruntime-genai-directml
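Once the environment above is active, a quick sanity check (a minimal sketch, not part of the original instructions) can confirm the package is importable:

```python
import importlib.util

def is_installed(package_name: str) -> bool:
    """Return True if the named package can be found by the active interpreter."""
    return importlib.util.find_spec(package_name) is not None

# Inside the activated "dmlgenai" environment this should print True.
print("onnxruntime-genai installed:", is_installed("onnxruntime_genai"))
```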
## How to Get Started with the Model
Use the code below to get started with the model.
[More Information Needed]
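As a starting point, here is a minimal generation sketch using the onnxruntime-genai Python API. The model directory path, the search options, and the generation loop reflect the library's documented API (which has changed across releases, e.g. `append_tokens` vs. setting `input_ids`) and are assumptions, not the model author's instructions:

```python
def build_prompt(user_message: str) -> str:
    # Single-turn Llama 3.1 chat template (simplified sketch).
    return (
        "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_message}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    )

def generate(model_dir: str, user_message: str, max_length: int = 256) -> str:
    # Imported here so the prompt helper above works without the package installed.
    import onnxruntime_genai as og

    model = og.Model(model_dir)  # loads genai_config.json and the ONNX weights
    tokenizer = og.Tokenizer(model)
    params = og.GeneratorParams(model)
    params.set_search_options(max_length=max_length)

    generator = og.Generator(model, params)
    generator.append_tokens(tokenizer.encode(build_prompt(user_message)))

    stream = tokenizer.create_stream()
    pieces = []
    while not generator.is_done():
        generator.generate_next_token()
        pieces.append(stream.decode(generator.get_next_tokens()[0]))
    return "".join(pieces)

if __name__ == "__main__":
    # Assumed path to the directory containing this model's ONNX files.
    print(generate(r"c:\temp\llama31-8b-int4-dml", "What is DirectML?"))
```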
#### Preprocessing [optional]
[More Information Needed]
#### Speeds, Sizes, Times [optional]
Approximately 15 tokens/s on a Radeon 780M with 8 GB dedicated VRAM.
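A throughput figure like this can be reproduced with a simple wall-clock harness; the helper below is an illustrative sketch in which `step_fn` stands in for one token-generation call (e.g. one `generate_next_token()`):

```python
import time

def measure_tokens_per_second(step_fn, n_tokens: int) -> float:
    """Call step_fn() once per token and return tokens/s over the whole run.

    step_fn stands in for a single token-generation step; this harness is
    an assumption for illustration, not part of the original card.
    """
    start = time.perf_counter()
    for _ in range(n_tokens):
        step_fn()
    elapsed = time.perf_counter() - start
    return n_tokens / elapsed
```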
#### Metrics
<!-- These are the evaluation metrics being used, ideally with a description of why. -->
[More Information Needed]
### Results
[More Information Needed]
#### Summary
## Model Examination [optional]
<!-- Relevant interpretability work for the model goes here -->
[More Information Needed]
## Technical Specifications [optional]
### Model Architecture and Objective
[More Information Needed]
### Compute Infrastructure
Microsoft Windows DirectML
#### Hardware
- AMD Ryzen 7840U with integrated Radeon 780M GPU
- 32 GB RAM
- 8 GB shared VRAM
#### Software
Microsoft Windows DirectML
## Model Card Authors [optional]
Mochamad Aris Zamroni
## Model Card Contact
https://www.linkedin.com/in/zamroni/