lbourdois commited on
Commit
89b275a
·
verified ·
1 Parent(s): 1cc8d0a

Improve language tag

Browse files

Hi! As the model is multilingual, this is a PR to add other languages than English to the language tag to improve the referencing. Note that 29 languages are announced in the README, but only 13 are explicitly listed. I was therefore only able to add these 13 languages.

Files changed (1) hide show
  1. README.md +209 -197
README.md CHANGED
@@ -1,198 +1,210 @@
1
- ---
2
- library_name: transformers
3
- tags:
4
- - unsloth
5
- - trl
6
- - grpo
7
- license: mit
8
- datasets:
9
- - eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1
10
- language:
11
- - en
12
- base_model:
13
- - Qwen/Qwen2.5-1.5B-Instruct
14
- ---
15
-
16
- # Qwen2.5-1.5B-Instruct Fine-Tuned on GSM8K with DeepSeek Augmentation
17
-
18
- ## 🚀 Model Overview
19
-
20
- This model is a **fine-tuned version of Qwen2.5-1.5B-Instruct**, optimized for **mathematical problem-solving with step-by-step reasoning**. It was trained on the **GSM8K dataset**, incorporating **Chain-of-Thought (CoT) reasoning** using **DeepSeek augmentation**.
21
-
22
- The model is designed to provide **logical, structured, and interpretable answers**, making it ideal for applications in **education, tutoring, and automated reasoning**.
23
-
24
- ### 🔹 **Key Features**
25
- - **Base Model:** `Qwen/Qwen2.5-1.5B-Instruct`
26
- - **Fine-Tuned On:** `eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1`
27
- - **Optimized for:** **Mathematical problem-solving & step-by-step logical reasoning**
28
- - **Fine-tuned with:** **LoRA (Low-Rank Adaptation) for efficient memory usage**
29
- - **Inference-ready:** Available on **🤗 Hugging Face** and **compatible with `llama.cpp`**
30
- - **Supports GGUF:** Optimized versions for **Q4_K_M, Q8_0, Q5_K_M, and FP16**
31
-
32
- ---
33
-
34
- ## 📂 **Model Details**
35
-
36
- - **Developed by:** [Your Name or Organization]
37
- - **Model Type:** Causal Language Model (**Text Generation**)
38
- - **Languages:** English (`en`)
39
- - **License:** MIT License
40
- - **Fine-tuned from:** `Qwen/Qwen2.5-1.5B-Instruct`
41
- - **Training Library:** `transformers` + `unsloth` + `trl`
42
- - **Quantization:** GGUF (`Q4_K_M, Q8_0, Q5_K_M, f16`)
43
-
44
- 🔗 **Hugging Face Repository**:
45
- 👉 [Fine-tuned Qwen2.5-1.5B-Instruct](https://huggingface.co/your-repo-id)
46
-
47
- ---
48
-
49
- ## 🛠 How to Use the Model
50
-
51
- ### **Using `transformers` in Python**
52
- You can load and use the model with 🤗 `transformers` as follows:
53
-
54
- ```python
55
- from transformers import AutoModelForCausalLM, AutoTokenizer
56
- import torch
57
-
58
- # Load model and tokenizer
59
- model_name = "your-repo-id"
60
- tokenizer = AutoTokenizer.from_pretrained(model_name)
61
- model = AutoModelForCausalLM.from_pretrained(model_name)
62
-
63
- # Move model to GPU if available
64
- device = "cuda" if torch.cuda.is_available() else "cpu"
65
- model.to(device)
66
-
67
- # Example inference
68
- question = "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?"
69
- inputs = tokenizer(question, return_tensors="pt").to(device)
70
- output = model.generate(**inputs, max_length=200)
71
-
72
- # Decode response
73
- print(tokenizer.decode(output[0], skip_special_tokens=True))
74
- ```
75
-
76
- ---
77
-
78
- ## 🖥️ Running the Model with `llama.cpp` (Mac/Linux/Windows)
79
-
80
- The model is **quantized** into GGUF format and can run on Mac **without a GPU** using `llama.cpp`.
81
-
82
- ### **1️⃣ Install `llama.cpp`**
83
- ```sh
84
- brew install llama.cpp
85
- ```
86
-
87
- ### **2️⃣ Download the Model**
88
- ```sh
89
- mkdir -p ~/llama_models && cd ~/llama_models
90
- wget https://huggingface.co/your-repo-id/resolve/main/q8_0.gguf
91
- ```
92
-
93
- ### **3️⃣ Run the Model**
94
- ```sh
95
- llama-cli -m ~/llama_models/q8_0.gguf --interactive
96
- ```
97
-
98
- Or you can use the following
99
-
100
- ```sh
101
- llama-cli -hf eagle0504/qwen-2_5-1_5b-instruct-using-openai-gsm8k-data-enhanced-with-deepseek-v4:Q8_0
102
- ```
103
-
104
- ### **4️⃣ Test with a Prompt**
105
- ```sh
106
- llama-cli -m ~/llama_models/q8_0.gguf -p "Explain quantum computing in simple terms."
107
- ```
108
-
109
- ---
110
-
111
- ## 🏋️ **Training Details**
112
-
113
- ### **📊 Dataset Used**
114
- The model was fine-tuned on:
115
- 🔹 [`eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1`](https://huggingface.co/datasets/eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1)
116
-
117
- This dataset contains:
118
- - **8K training samples**
119
- - **1K testing samples**
120
- - Features: `"question"`, `"answer"`, `"cot"` (Chain-of-Thought)
121
-
122
- ### **⚙️ Training Configuration**
123
- - **Framework:** `transformers` + `unsloth` + `trl`
124
- - **Optimization:**
125
- - **LoRA (Low-Rank Adaptation)** applied to QKV projections
126
- - **Learning Rate:** `1e-6`
127
- - **AdamW Optimizer (8-bit)**
128
- - **Mixed Precision (`bf16` or `fp16`)**
129
- - **Batch Size:** `8`
130
- - **Gradient Accumulation Steps:** `1`
131
- - **Max Sequence Length:** `1024`
132
-
133
- ---
134
-
135
- ## 📊 **Model Performance**
136
-
137
- ### **✅ Training Loss**
138
- | Step | Training Loss | Reward | KL |
139
- |------|--------------|--------|------|
140
- | 1 | 0.0000 | 0.0000 | 0.0000 |
141
- | 500 | 0.0033 | 0.2617 | 0.0821 |
142
- | 1000 | 0.0028 | 0.1359 | 0.0696 |
143
- | 1500 | 0.0062 | 1.3781 | 0.1559 |
144
-
145
- ### **🧪 Testing & Expected Results**
146
- The model was evaluated on the **1K test samples** and showed strong accuracy in multi-step problem-solving.
147
-
148
- Example expected response:
149
- ```text
150
- To solve the problem, we first find the clips sold in May:
151
- Clips in May = 48 / 2 = 24
152
- Next, we find the total:
153
- Total Clips = 48 + 24 = 72
154
- #### Answer: 72
155
- ```
156
-
157
- ---
158
-
159
- ## 🚨 **Bias, Risks, and Limitations**
160
- ### ⚠️ **Potential Risks**
161
- - May **hallucinate** incorrect reasoning steps if prompts are unclear.
162
- - Could struggle with **complex mathematical problems** outside its training data.
163
- - **Limited generalization** to non-math reasoning tasks.
164
-
165
- ### 🎯 **Recommendations**
166
- - If using this model for **critical applications**, verify outputs with human review.
167
- - For **better performance**, fine-tune on **larger datasets** with real-world numerical reasoning.
168
-
169
- ---
170
-
171
- ## 🌍 **Environmental Impact**
172
- **Estimated Carbon Emissions:**
173
- - **Hardware Used:** NVIDIA A100 GPU
174
- - **Training Time:** ~5 hours
175
- - **Estimated CO2 Emitted:** ~8.2 kg CO2eq (via [ML Impact Calculator](https://mlco2.github.io/impact#compute))
176
-
177
- ---
178
-
179
- ## 📖 **Citation**
180
- If you use this model in your research, please cite it as:
181
-
182
- ```bibtex
183
- @misc{your_model_2024,
184
- title={Fine-Tuned Qwen2.5-1.5B-Instruct on GSM8K with DeepSeek Augmentation},
185
- author={Your Name},
186
- year={2024},
187
- url={https://huggingface.co/your-repo-id}
188
- }
189
- ```
190
-
191
- ---
192
-
193
- ## 📩 **Model Card Contact**
194
- For questions, suggestions, or issues, reach out via [Hugging Face Discussions](https://huggingface.co/your-repo-id/discussions).
195
-
196
- ---
197
-
 
 
 
 
 
 
 
 
 
 
 
 
198
  🎉 **Thank you for using this model!** If you find it useful, please ⭐ it on **Hugging Face**! 🚀🔥
 
1
+ ---
2
+ library_name: transformers
3
+ tags:
4
+ - unsloth
5
+ - trl
6
+ - grpo
7
+ license: mit
8
+ datasets:
9
+ - eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1
10
+ language:
11
+ - zho
12
+ - eng
13
+ - fra
14
+ - spa
15
+ - por
16
+ - deu
17
+ - ita
18
+ - rus
19
+ - jpn
20
+ - kor
21
+ - vie
22
+ - tha
23
+ - ara
24
+ base_model:
25
+ - Qwen/Qwen2.5-1.5B-Instruct
26
+ ---
27
+
28
+ # Qwen2.5-1.5B-Instruct Fine-Tuned on GSM8K with DeepSeek Augmentation
29
+
30
+ ## 🚀 Model Overview
31
+
32
+ This model is a **fine-tuned version of Qwen2.5-1.5B-Instruct**, optimized for **mathematical problem-solving with step-by-step reasoning**. It was trained on the **GSM8K dataset**, incorporating **Chain-of-Thought (CoT) reasoning** using **DeepSeek augmentation**.
33
+
34
+ The model is designed to provide **logical, structured, and interpretable answers**, making it ideal for applications in **education, tutoring, and automated reasoning**.
35
+
36
+ ### 🔹 **Key Features**
37
+ - **Base Model:** `Qwen/Qwen2.5-1.5B-Instruct`
38
+ - **Fine-Tuned On:** `eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1`
39
+ - **Optimized for:** **Mathematical problem-solving & step-by-step logical reasoning**
40
+ - **Fine-tuned with:** **LoRA (Low-Rank Adaptation) for efficient memory usage**
41
+ - **Inference-ready:** Available on **🤗 Hugging Face** and **compatible with `llama.cpp`**
42
+ - **Supports GGUF:** Optimized versions for **Q4_K_M, Q8_0, Q5_K_M, and FP16**
43
+
44
+ ---
45
+
46
+ ## 📂 **Model Details**
47
+
48
+ - **Developed by:** [Your Name or Organization]
49
+ - **Model Type:** Causal Language Model (**Text Generation**)
50
+ - **Languages:** English (`en`)
51
+ - **License:** MIT License
52
+ - **Fine-tuned from:** `Qwen/Qwen2.5-1.5B-Instruct`
53
+ - **Training Library:** `transformers` + `unsloth` + `trl`
54
+ - **Quantization:** GGUF (`Q4_K_M, Q8_0, Q5_K_M, f16`)
55
+
56
+ 🔗 **Hugging Face Repository**:
57
+ 👉 [Fine-tuned Qwen2.5-1.5B-Instruct](https://huggingface.co/your-repo-id)
58
+
59
+ ---
60
+
61
+ ## 🛠 How to Use the Model
62
+
63
+ ### **Using `transformers` in Python**
64
+ You can load and use the model with 🤗 `transformers` as follows:
65
+
66
+ ```python
67
+ from transformers import AutoModelForCausalLM, AutoTokenizer
68
+ import torch
69
+
70
+ # Load model and tokenizer
71
+ model_name = "your-repo-id"
72
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
73
+ model = AutoModelForCausalLM.from_pretrained(model_name)
74
+
75
+ # Move model to GPU if available
76
+ device = "cuda" if torch.cuda.is_available() else "cpu"
77
+ model.to(device)
78
+
79
+ # Example inference
80
+ question = "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?"
81
+ inputs = tokenizer(question, return_tensors="pt").to(device)
82
+ output = model.generate(**inputs, max_length=200)
83
+
84
+ # Decode response
85
+ print(tokenizer.decode(output[0], skip_special_tokens=True))
86
+ ```
87
+
88
+ ---
89
+
90
+ ## 🖥️ Running the Model with `llama.cpp` (Mac/Linux/Windows)
91
+
92
+ The model is **quantized** into GGUF format and can run on Mac **without a GPU** using `llama.cpp`.
93
+
94
+ ### **1️⃣ Install `llama.cpp`**
95
+ ```sh
96
+ brew install llama.cpp
97
+ ```
98
+
99
+ ### **2️⃣ Download the Model**
100
+ ```sh
101
+ mkdir -p ~/llama_models && cd ~/llama_models
102
+ wget https://huggingface.co/your-repo-id/resolve/main/q8_0.gguf
103
+ ```
104
+
105
+ ### **3️⃣ Run the Model**
106
+ ```sh
107
+ llama-cli -m ~/llama_models/q8_0.gguf --interactive
108
+ ```
109
+
110
+ Or you can use the following
111
+
112
+ ```sh
113
+ llama-cli -hf eagle0504/qwen-2_5-1_5b-instruct-using-openai-gsm8k-data-enhanced-with-deepseek-v4:Q8_0
114
+ ```
115
+
116
+ ### **4️⃣ Test with a Prompt**
117
+ ```sh
118
+ llama-cli -m ~/llama_models/q8_0.gguf -p "Explain quantum computing in simple terms."
119
+ ```
120
+
121
+ ---
122
+
123
+ ## 🏋️ **Training Details**
124
+
125
+ ### **📊 Dataset Used**
126
+ The model was fine-tuned on:
127
+ 🔹 [`eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1`](https://huggingface.co/datasets/eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1)
128
+
129
+ This dataset contains:
130
+ - **8K training samples**
131
+ - **1K testing samples**
132
+ - Features: `"question"`, `"answer"`, `"cot"` (Chain-of-Thought)
133
+
134
+ ### **⚙️ Training Configuration**
135
+ - **Framework:** `transformers` + `unsloth` + `trl`
136
+ - **Optimization:**
137
+ - **LoRA (Low-Rank Adaptation)** applied to QKV projections
138
+ - **Learning Rate:** `1e-6`
139
+ - **AdamW Optimizer (8-bit)**
140
+ - **Mixed Precision (`bf16` or `fp16`)**
141
+ - **Batch Size:** `8`
142
+ - **Gradient Accumulation Steps:** `1`
143
+ - **Max Sequence Length:** `1024`
144
+
145
+ ---
146
+
147
+ ## 📊 **Model Performance**
148
+
149
+ ### **✅ Training Loss**
150
+ | Step | Training Loss | Reward | KL |
151
+ |------|--------------|--------|------|
152
+ | 1 | 0.0000 | 0.0000 | 0.0000 |
153
+ | 500 | 0.0033 | 0.2617 | 0.0821 |
154
+ | 1000 | 0.0028 | 0.1359 | 0.0696 |
155
+ | 1500 | 0.0062 | 1.3781 | 0.1559 |
156
+
157
+ ### **🧪 Testing & Expected Results**
158
+ The model was evaluated on the **1K test samples** and showed strong accuracy in multi-step problem-solving.
159
+
160
+ Example expected response:
161
+ ```text
162
+ To solve the problem, we first find the clips sold in May:
163
+ Clips in May = 48 / 2 = 24
164
+ Next, we find the total:
165
+ Total Clips = 48 + 24 = 72
166
+ #### Answer: 72
167
+ ```
168
+
169
+ ---
170
+
171
+ ## 🚨 **Bias, Risks, and Limitations**
172
+ ### ⚠️ **Potential Risks**
173
+ - May **hallucinate** incorrect reasoning steps if prompts are unclear.
174
+ - Could struggle with **complex mathematical problems** outside its training data.
175
+ - **Limited generalization** to non-math reasoning tasks.
176
+
177
+ ### 🎯 **Recommendations**
178
+ - If using this model for **critical applications**, verify outputs with human review.
179
+ - For **better performance**, fine-tune on **larger datasets** with real-world numerical reasoning.
180
+
181
+ ---
182
+
183
+ ## 🌍 **Environmental Impact**
184
+ **Estimated Carbon Emissions:**
185
+ - **Hardware Used:** NVIDIA A100 GPU
186
+ - **Training Time:** ~5 hours
187
+ - **Estimated CO2 Emitted:** ~8.2 kg CO2eq (via [ML Impact Calculator](https://mlco2.github.io/impact#compute))
188
+
189
+ ---
190
+
191
+ ## 📖 **Citation**
192
+ If you use this model in your research, please cite it as:
193
+
194
+ ```bibtex
195
+ @misc{your_model_2024,
196
+ title={Fine-Tuned Qwen2.5-1.5B-Instruct on GSM8K with DeepSeek Augmentation},
197
+ author={Your Name},
198
+ year={2024},
199
+ url={https://huggingface.co/your-repo-id}
200
+ }
201
+ ```
202
+
203
+ ---
204
+
205
+ ## 📩 **Model Card Contact**
206
+ For questions, suggestions, or issues, reach out via [Hugging Face Discussions](https://huggingface.co/your-repo-id/discussions).
207
+
208
+ ---
209
+
210
  🎉 **Thank you for using this model!** If you find it useful, please ⭐ it on **Hugging Face**! 🚀🔥