|
--- |
|
base_model: |
|
- meta-llama/Llama-3.3-70B-Instruct |
|
base_model_relation: quantized |
|
license: llama3.3 |
|
--- |
|
# Model Card |
|
|
|
- Base model: `meta-llama/Llama-3.3-70B-Instruct` |
|
- Quantization method: SqueezeLLM |
|
- Target bit-width: 2 |
|
- Backend kernel: Any-Precision-LLM kernel (`ap-gemv`) |
|
- Calibration data: RedPajama (1024 sequences × 4096 tokens)
|
- Calibration objective: Next-token prediction |
|
- `num_groups` (for the GuidedQuant Hessian): 1
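SqueezeLLM represents each weight channel with a small lookup table (LUT) of non-uniform centroids found by (sensitivity-weighted) k-means; at 2 bits, each weight is replaced by a 2-bit index into a 4-entry table. The sketch below illustrates the idea with plain, unweighted k-means on synthetic weights — it is a conceptual toy, not the actual SqueezeLLM/GuidedQuant implementation, and the function name `lut_quantize_2bit` is made up for this example.

```python
import numpy as np

def lut_quantize_2bit(weights, n_bits=2, iters=10):
    """Cluster one weight channel into 2**n_bits centroids using plain
    Lloyd's k-means (SqueezeLLM additionally weights each point by a
    sensitivity term derived from the Hessian)."""
    k = 2 ** n_bits
    # initialize centroids from quantiles of the weight distribution
    lut = np.quantile(weights, np.linspace(0.0, 1.0, k))
    for _ in range(iters):
        # assign each weight to its nearest centroid
        idx = np.abs(weights[:, None] - lut[None, :]).argmin(axis=1)
        # move each centroid to the mean of its assigned weights
        for c in range(k):
            members = weights[idx == c]
            if members.size:
                lut[c] = members.mean()
    idx = np.abs(weights[:, None] - lut[None, :]).argmin(axis=1)
    return lut, idx.astype(np.uint8)

# demo: quantize a synthetic channel of 4096 weights to 2 bits
w = np.random.default_rng(0).normal(size=4096).astype(np.float32)
lut, codes = lut_quantize_2bit(w)
w_hat = lut[codes]  # dequantization is just a table lookup
```

At inference time the `ap-gemv` kernel fuses this table lookup into the matrix-vector product, so the 2-bit codes are never materialized as full-precision weights in memory.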
|
|
|
# How to run |
|
- Follow the instructions at https://github.com/snu-mllab/GuidedQuant.