RedHatAI/Devstral-Small-2507-FP8-Dynamic


ekurtic committed 27 days ago · Commit 42a06be (verified) · 1 parent: 4cd653b

Upload folder using huggingface_hub

Files changed (1)

README.md +11 -5


@@ -54,15 +54,21 @@ vllm serve RedHatAI/Devstral-Small-2507-FP8-Dynamic --tensor-parallel-size 1 --t
 ## Evaluation
 The model was evaluated on popular coding tasks (HumanEval, HumanEval+, MBPP, MBPP+) via [EvalPlus](https://github.com/evalplus/evalplus) and vllm backend (v0.10.1.1).
-For evaluations, we run greedy sampling and report pass@1
+For evaluations, we run greedy sampling and report pass@1. The command to reproduce evals:
+```bash
+evalplus.evaluate --model "RedHatAI/Devstral-Small-2507-FP8-Dynamic" \
+                  --dataset [humaneval|mbpp] \
+                  --base-url http://localhost:8000/v1 \
+                  --backend openai --greedy
+```
 ### Accuracy
 |                             | Recovery (%) | mistralai/Devstral-Small-2507 | RedHatAI/Devstral-Small-2507-FP8-Dynamic<br>(this model) |
 | --------------------------- | :----------: | :------------------: | :--------------------------------------------------: |
-| HumanEval                   | 98.50        | 89.0                | 89.6                                                |
-| HumanEval+                  | 99.88        | 81.1                | 82.9                                                |
-| MBPP                        | 101.21       | 77.5                | 75.4                                                |
-| MBPP+                       | 101.21       | 66.1                | 64.8                                                |
+| HumanEval                   | 100.67        | 89.0                | 89.6                                                |
+| HumanEval+                  | 102.22        | 81.1                | 82.9                                                |
+| MBPP                        | 97.29       | 77.5                | 75.4                                                |
+| MBPP+                       | 98.03       | 66.1                | 64.8                                                |
 | **Average Score**           | **99.68**    | **78.43**            | **78.18**                                            |
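
The corrected Recovery (%) values can be cross-checked from the pass@1 scores in the table. The sketch below assumes recovery is defined as the quantized model's score divided by the baseline score, times 100; the README does not state the formula, but this definition matches the updated rows. The names `scores`, `recovery`, `per_task`, and `average_recovery` are illustrative and not part of EvalPlus or the model card.

```python
# Pass@1 scores copied from the Accuracy table: (baseline, FP8-Dynamic).
scores = {
    "HumanEval": (89.0, 89.6),
    "HumanEval+": (81.1, 82.9),
    "MBPP": (77.5, 75.4),
    "MBPP+": (66.1, 64.8),
}


def recovery(baseline: float, quantized: float) -> float:
    """Quantized score as a percentage of the baseline score (assumed definition)."""
    return 100.0 * quantized / baseline


# Per-task recovery, rounded to two decimals as in the table.
per_task = {task: round(recovery(b, q), 2) for task, (b, q) in scores.items()}

# Recovery of the average score: ratio of the two column averages.
# The 1/4 factors cancel, so the ratio of sums is equivalent.
baseline_total = sum(b for b, _ in scores.values())
quantized_total = sum(q for _, q in scores.values())
average_recovery = round(recovery(baseline_total, quantized_total), 2)

for task, value in per_task.items():
    print(f"{task}: {value}")
print(f"Average Score: {average_recovery}")
```

Under this assumption the Average Score recovery appears to be the ratio of the two column averages rather than the mean of the four per-task recoveries, which would come out slightly lower.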