linzhao-amd committed · Commit 5b1d7fe · verified · 1 Parent(s): eb42dad

Update README.md

Files changed (1)
  1. README.md +26 -0
README.md CHANGED
@@ -11,6 +11,9 @@ base_model:
  - **Model Name**: Mixtral-7x8b
  - **Version**: MLPerf v5.1
  - **Commit**: Close Division Commit
+ - **Supported Hardware Microarchitecture**: AMD MI300/MI325
+ - **Transformers**: 4.51.0
+ - **Quark:** [0.9](https://quark.docs.amd.com/latest/install.html)
 
  ## Calibration Dataset
  The calibration dataset consists of **1024 mixed datasets** provided by MLPerf, which includes:
@@ -30,6 +33,29 @@ The following layers are ignored during quantization:
  - `*.o_proj`
  - `lm_head`
 
+ ## Quantization Scripts
+ ```
+ cd examples/torch/language_modeling/llm_ptq/
+ MODEL_DIR="mistralai/Mixtral-8x7B-Instruct-v0.1"
+ DATASET="./mlperf_data/mixtral_8x7b%2F2024.06.06_mixtral_15k_calibration_v4.pkl"
+ OUTPUT_DIR="quantized_models/Mixtral-8x7B-Instruct-v0.1_FP8_MLPerf"
+
+ python3 quantize_quark.py --model_dir "${MODEL_DIR}" \
+ --output_dir "${OUTPUT_DIR}" \
+ --dataset "${DATASET}" \
+ --data_type float16 \
+ --multi_gpu \
+ --quant_scheme w_fp8_a_fp8 \
+ --kv_cache_dtype fp8 \
+ --num_calib_data 1024 \
+ --seq_len 1024 \
+ --min_kv_scale 1.0 \
+ --model_export hf_format \
+ --custom_mode fp8 \
+ --quant_algo autosmoothquant \
+ --exclude_layers "lm_head" "*.gate"
+ ```
+
  # Model Performance Comparison
 
  | Metric | Baseline Accuracy Target (%) | FP8 Quant Accuracy (%) |