Update README.md

added evaluation metric.

README.md
@@ -26,6 +26,26 @@ This model is a LoRA (Low-Rank Adaptation) fine-tuned version of **Qwen2.5-1.5B-Instruct**

---
## Evaluation on MATH-500 Benchmark

Following the sampling-based Pass@1 methodology of [DeepSeek R1](https://arxiv.org/abs/2501.12948), we evaluated the model with the following settings:

| Parameter       | Value                    |
|-----------------|--------------------------|
| **Dataset**     | `HuggingFaceH4/MATH-500` |
| **Temperature** | `0.6`                    |
| **Top_p**       | `0.95`                   |
| **Num_samples** | `16` per question        |
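For reference, below is a minimal sketch of how these sampling settings could be reproduced with `transformers` and `peft`; the adapter repo ID, prompt, and `max_new_tokens` are illustrative placeholders, not the exact evaluation harness.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "Qwen/Qwen2.5-1.5B-Instruct"
ADAPTER = "your-username/your-lora-adapter"  # placeholder: replace with this repo's ID

tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(
    BASE, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, ADAPTER)  # attach the LoRA adapter

question = "What is the sum of the first 100 positive integers?"  # example prompt
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": question}],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Draw 16 independent samples per question with the settings from the table above.
outputs = model.generate(
    inputs,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
    num_return_sequences=16,
    max_new_tokens=1024,
)
completions = tokenizer.batch_decode(
    outputs[:, inputs.shape[1]:], skip_special_tokens=True
)
```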
### Results

- **At-least-one-correct Rate:** **54.60%** (273 out of 500 questions)

*This metric is the percentage of questions with at least one correct solution among the 16 sampled attempts.*
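To make the metric concrete, here is a small sketch of how the at-least-one-correct rate is computed; the correctness flags are made up for illustration and stand in for a real answer checker.

```python
# results[q][i] is True if sample i of question q was judged correct.
# These flags are illustrative, not actual evaluation output.
results = {
    "q1": [False] * 15 + [True],  # solved on the 16th sample
    "q2": [True] * 16,            # solved every time
    "q3": [False] * 16,           # never solved
}

# A question counts as solved if any of its 16 samples is correct.
solved = sum(any(flags) for flags in results.values())
rate = solved / len(results)
print(f"At-least-one-correct rate: {rate:.2%} ({solved} of {len(results)} questions)")
```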
---
## How to Use
### Example Python Script