---
library_name: transformers
tags:
- unsloth
- trl
- grpo
license: mit
datasets:
- eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
base_model:
- Qwen/Qwen2.5-1.5B-Instruct
---

# Qwen2.5-1.5B-Instruct Fine-Tuned on GSM8K with DeepSeek Augmentation

## Model Overview

This model is a fine-tuned version of **Qwen2.5-1.5B-Instruct**, designed for **mathematical problem-solving and structured reasoning**. It is trained on an **enhanced GSM8K dataset** incorporating **Chain-of-Thought (CoT) reasoning** augmented by **DeepSeek AI**.

### Key Features
- **Base Model:** Qwen2.5-1.5B-Instruct
- **Fine-Tuned On:** GSM8K enhanced with DeepSeek-V3
- **Optimized for:** Logical problem-solving and math reasoning
- **Fine-tuning method:** LoRA (Low-Rank Adaptation)
- **Inference-ready:** Available on **Hugging Face** and compatible with `llama.cpp`
- **Supports GGUF:** Optimized versions for **Q4_K_M, Q8_0, Q5_K_M, and FP16**

## Model Details

- **Developed by:** [Your Name or Organization]
- **Model Type:** Causal Language Model (Text Generation)
- **Languages:** Multilingual (inherited from the Qwen2.5 base model); the fine-tuning data is English (`en`)
- **License:** MIT License
- **Fine-tuned from:** `Qwen/Qwen2.5-1.5B-Instruct`
- **Training Library:** `transformers` + `unsloth` + `trl`
- **Quantization:** GGUF (`Q4_K_M`, `Q8_0`, `Q5_K_M`, `f16`)

🔗 **Hugging Face Repository:**
👉 [Fine-tuned Qwen2.5-1.5B-Instruct](https://huggingface.co/eagle0504/qwen-2_5-1_5b-instruct-using-openai-gsm8k-data-enhanced-with-deepseek-v1)

## How to Use the Model

### Using `transformers` in Python
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model_name = "eagle0504/qwen-2_5-1_5b-instruct-using-openai-gsm8k-data-enhanced-with-deepseek-v1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Move model to GPU if available
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Example inference: wrap the question in the instruct chat template
question = "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?"
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": question}],
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(prompt, return_tensors="pt").to(device)
output = model.generate(**inputs, max_new_tokens=200)

# Decode only the newly generated tokens
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
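
For the sample question above, a correct response should reason that Natalia sold 48 clips in April and 48 / 2 = 24 in May, for 48 + 24 = 72 clips altogether.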

## Running the Model with `llama.cpp`

### Step 1: Install `llama.cpp`
```sh
brew install llama.cpp
```

### Step 2: Download the Model
```sh
mkdir -p ~/llama_models && cd ~/llama_models
wget https://huggingface.co/eagle0504/qwen-2_5-1_5b-instruct-using-openai-gsm8k-data-enhanced-with-deepseek-v1/resolve/main/q8_0.gguf
```

### Step 3: Run the Model
```sh
llama-cli -m ~/llama_models/q8_0.gguf --interactive
```

Alternatively, `llama-cli` can download a GGUF build directly from the Hugging Face Hub (note that this example references a related 3B GGUF repository rather than the 1.5B model documented here):

```sh
llama-cli -hf eagle0504/qwen-2-5-3b-instruct-using-openai-gsm8k-gguf-data-enhanced-with-deepseek-v3-small:Q8_0
```

### Step 4: Test with a Prompt
```sh
llama-cli -m ~/llama_models/q8_0.gguf -p "Explain quantum computing in simple terms."
```

## Training Details

### Dataset Used
The model was fine-tuned on:
🔹 [`eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1`](https://huggingface.co/datasets/eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1)

This dataset contains:
- **8K training samples**
- **1K testing samples**
- Features: `question`, `answer`, `cot` (Chain-of-Thought)
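
You can inspect the dataset with the `datasets` library. A minimal sketch follows; the split names `train` and `test` are assumptions based on the 8K/1K description above:

```python
from datasets import load_dataset

# Load the DeepSeek-enhanced GSM8K dataset from the Hugging Face Hub
ds = load_dataset("eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1")

# Expected features on each record: question, answer, cot
example = ds["train"][0]  # "train"/"test" split names are assumed
print(example["question"])
print(example["cot"])     # DeepSeek-generated Chain-of-Thought
print(example["answer"])
```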

### Training Configuration
- **Framework:** `transformers` + `unsloth` + `trl`
- **Optimization:** LoRA applied to QKV projections
- **Learning Rate:** `1e-6`
- **Optimizer:** AdamW (8-bit)
- **Precision:** Mixed (`bf16` or `fp16`)
- **Batch Size:** `8`
- **Max Sequence Length:** `1024`
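
For concreteness, here is a hedged sketch of how such a configuration might be wired up with `unsloth` and TRL's `SFTTrainer`. Only the hyperparameters listed above come from this card; the LoRA rank, dataset preprocessing, and exact trainer arguments are assumptions, and the card's `grpo` tag suggests a GRPO stage may also have been involved that this sketch does not cover:

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments

# Load the base model with unsloth
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen2.5-1.5B-Instruct",
    max_seq_length=1024,
)

# Attach LoRA adapters to the QKV projections
model = FastLanguageModel.get_peft_model(
    model,
    r=16,  # assumed LoRA rank; not stated in this card
    target_modules=["q_proj", "k_proj", "v_proj"],
)

# Trainer wiring is an assumption; argument names vary across trl versions
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_ds,  # assumed: the enhanced GSM8K split, formatted as prompts
    max_seq_length=1024,
    args=TrainingArguments(
        per_device_train_batch_size=8,
        learning_rate=1e-6,
        optim="adamw_8bit",   # 8-bit AdamW
        bf16=True,            # or fp16=True on GPUs without bf16 support
        output_dir="outputs",
    ),
)
trainer.train()
```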

## Model Performance

### Training Loss
| Step | Training Loss |
|------|---------------|
| 10   | 1.1335        |
| 100  | 0.9770        |
| 3100 | 0.1722        |
| 9340 | 0.1553        |

## Bias, Risks, and Limitations

### Potential Risks
- May **hallucinate** incorrect reasoning steps if prompts are unclear.
- Could struggle with **complex mathematical problems** outside its training data.
- **Limited generalization** to non-math reasoning tasks.

### Recommendations
- If using this model for **critical applications**, verify outputs with human review.
- For **better performance**, fine-tune on **larger datasets** with real-world numerical reasoning.

## Environmental Impact

**Estimated Carbon Emissions:**
- **Hardware Used:** NVIDIA A100 GPU
- **Training Time:** ~5 hours
- **Estimated CO2 Emitted:** ~8.2 kg CO2eq (via [ML Impact Calculator](https://mlco2.github.io/impact#compute))

## Citation

If you use this model in your research, please cite it as:
```bibtex
@misc{coming,
  title={Fine-Tuned Qwen2.5-1.5B-Instruct on GSM8K with DeepSeek Augmentation},
  author={Your Name},
  year={2024},
  url={https://huggingface.co/eagle0504/qwen-2_5-1_5b-instruct-using-openai-gsm8k-data-enhanced-with-deepseek-v1}
}
```

## Contact
For questions, suggestions, or issues, reach out via [Hugging Face Discussions](https://huggingface.co/eagle0504/qwen-2_5-1_5b-instruct-using-openai-gsm8k-data-enhanced-with-deepseek-v1/discussions).

---

🎉 **Thank you for using this model!** If you find it useful, please ⭐ it on **Hugging Face**! 🚀🔥