sebastavar committed on
Commit 6c21a41 · verified · 1 Parent(s): 9ffbae6

Update README.md

Files changed (1)
  1. README.md +3 -0
README.md CHANGED
@@ -70,8 +70,10 @@ Perplexity (PPL) streaming evaluation on WikiText-2 (raw, test); fast preset wit
 |----------------------|-----------------------|
 | MLX 8-bit (gs=32) | 7.39 |
 | MLX bf16 (reference) | 7.38 |
+| MLX 6-bit (gs=64) | 7.40 |
 
 Notes:
+
 - Results from local runs on Apple Silicon using MLX; numbers vary slightly with tokenizer details, logits dtype, and token subset.
 - For more sensitive comparisons, use overlapping windows (e.g., `--stride 512`) and evaluate the full split.
 
@@ -89,6 +91,7 @@ python -m mlx_lm convert \
 ## Sibling & reference models
 
 - halley-ai/gpt-oss-120b-MLX-bf16 (non-quantized reference)
+- halley-ai/gpt-oss-120b-MLX-6bit-gs64 (smaller/faster variant)
 
 ## Limitations & biases
 
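
The README note in the first hunk recommends overlapping windows (e.g., `--stride 512`) for more sensitive perplexity comparisons. As a rough illustration of that idea only, here is a minimal sketch of strided-window perplexity. It is not the evaluation script behind the table: `strided_windows`, `perplexity`, and the `token_nll` callback are hypothetical names, and the callback is assumed to return one negative log-likelihood (in nats) per token of the window it receives.

```python
import math
from typing import Callable, Iterator, List, Sequence, Tuple


def strided_windows(n_tokens: int, window: int, stride: int) -> Iterator[Tuple[int, int, int]]:
    """Yield (begin, end, n_new) spans; n_new counts tokens not scored by an earlier window."""
    if not 0 < stride <= window:
        raise ValueError("stride must satisfy 0 < stride <= window")
    prev_end = 0
    for begin in range(0, n_tokens, stride):
        end = min(begin + window, n_tokens)
        yield begin, end, end - prev_end
        prev_end = end
        if end == n_tokens:
            break


def perplexity(
    tokens: Sequence[int],
    window: int,
    stride: int,
    token_nll: Callable[[Sequence[int]], List[float]],  # hypothetical model hook
) -> float:
    """Score each token exactly once, using the overlap purely as left context."""
    if len(tokens) == 0:
        raise ValueError("need at least one token")
    total_nll, total_scored = 0.0, 0
    for begin, end, n_new in strided_windows(len(tokens), window, stride):
        nll = token_nll(tokens[begin:end])
        # Only the trailing n_new tokens are new; earlier ones were already counted.
        total_nll += sum(nll[-n_new:])
        total_scored += n_new
    return math.exp(total_nll / total_scored)
```

With a smaller stride, each scored token sees more left context, which tends to lower the reported perplexity at the cost of more forward passes; setting `stride == window` gives non-overlapping windows.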