---
base_model:
- meta-llama/Llama-3.3-70B-Instruct
base_model_relation: quantized
license: llama3.3
---
# Model Card

- Base model: `meta-llama/Llama-3.3-70B-Instruct`
- Quantization method: SqueezeLLM
- Target bit-width: 3
- Backend kernel: Any-Precision-LLM kernel (`ap-gemv`)
- Calibration data: RedPajama (1024 sequences of 4096 tokens)
- Calibration objective: Next-token prediction
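SqueezeLLM quantizes each weight to one of a small set of non-uniform centroids (2³ = 8 levels at 3-bit). Below is a minimal, illustrative sketch of that idea using plain 1-D k-means; the actual method uses a sensitivity-weighted variant, and this toy code is not part of the released pipeline.

```python
import numpy as np

def kmeans_1d(weights, n_bits=3, n_iter=50):
    """Cluster a 1-D weight vector into 2**n_bits centroids.

    Plain (unweighted) k-means for illustration only; SqueezeLLM itself
    weights each point by its estimated loss sensitivity.
    """
    k = 2 ** n_bits
    # Initialize centroids at evenly spaced quantiles so every cluster
    # starts in a populated region of the weight distribution.
    centroids = np.quantile(weights, (np.arange(k) + 0.5) / k)
    for _ in range(n_iter):
        # Assign each weight to its nearest centroid (the 3-bit code).
        codes = np.abs(weights[:, None] - centroids[None, :]).argmin(axis=1)
        # Move each centroid to the mean of its assigned weights.
        for j in range(k):
            mask = codes == j
            if mask.any():
                centroids[j] = weights[mask].mean()
    return codes, centroids

# Toy example: quantize 4096 Gaussian "weights" to 3 bits.
w = np.random.default_rng(1).normal(size=4096).astype(np.float32)
codes, centroids = kmeans_1d(w)
w_hat = centroids[codes]  # dequantized weights
```

Storing `codes` (3 bits each) plus the 8 `centroids` per group is what makes the non-uniform format compact while keeping reconstruction error low.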


# How to run
- Follow the instructions at https://github.com/snu-mllab/GuidedQuant.

# References
- [Model Paper](https://arxiv.org/abs/2505.07004)