Update README.md
README.md CHANGED
@@ -14,7 +14,7 @@ tags:
 - physicswallah
 language:
 - en
-model_name: PhysicsWallah/
+model_name: PhysicsWallah/Aryabhata-1.0
 model_creator: Physics Wallah AI Research
 model_type: Causal decoder-based model
 base_model: Qwen/Qwen2.5-Math-7B

@@ -23,10 +23,10 @@ pipeline_tag: text-generation
 
 # Aryabhatta 1.0 🌟
 
-**
+**Aryabhata 1.0** is a 7B parameter small language model for mathematics developed by **Physics Wallah AI Research**, optimized for high-stakes Indian competitive exams like **JEE Mains**. Despite its compact size, Aryabhata 1.0 achieves **state-of-the-art performance** on exam-centric reasoning tasks with impressive **token efficiency** and low inference cost.
 
 
-> 🚧 *
+> 🚧 *Aryabhata 1.0 is an **experimental release**. We are actively seeking feedback; please contribute in the Discussion tab of this repo.*
 ---
 
 ## 🧠 Key Features

@@ -51,7 +51,7 @@ pipeline_tag: text-generation
 - **Reinforcement Learning with Verifiable Rewards (RLVR)**
 
 ### 🔀 Model Merging
-We began with model merging (Weighted average) to build a strong initialization (
+We began with model merging (weighted average) to build a strong initialization (Aryabhata 0.5) by combining diverse model capabilities:
 * Qwen 2.5 Math: A robust math-centric LLM with solid symbolic math foundations.
 * Ace Math: An enhanced version of Qwen 2.5 Math, fine-tuned by NVIDIA for improved accuracy in mathematics benchmarks.
 * DeepSeek R1 Distill Qwen: A long-form reasoning model, fine-tuned on reasoning traces distilled from DeepSeek R1.

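A weighted average of this sort operates directly on the parents' state dicts. Below is a minimal sketch, assuming all three parents share the Qwen2.5 7B architecture; the repo ids and merge weights are illustrative assumptions, not the recipe actually used for Aryabhata 0.5:

```python
# Weighted-average merge sketch: merged = sum_i w_i * params_i.
# Repo ids and weights below are assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM

parents = {
    "Qwen/Qwen2.5-Math-7B": 0.4,                     # assumed weight
    "nvidia/AceMath-7B-Instruct": 0.3,               # assumed weight
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B": 0.3,  # assumed weight
}

merged = None
for repo_id, w in parents.items():
    sd = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16).state_dict()
    if merged is None:
        merged = {k: w * v.float() for k, v in sd.items()}
    else:
        for k, v in sd.items():
            merged[k] += w * v.float()

# Reuse one parent's architecture/config to hold the averaged weights.
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Math-7B", torch_dtype=torch.bfloat16)
base.load_state_dict({k: v.to(torch.bfloat16) for k, v in merged.items()})
base.save_pretrained("aryabhata-0.5-merged")
```

In practice a dedicated tool such as mergekit also reconciles configs and tokenizers; the loop above shows only the weight arithmetic.
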
@@ -63,7 +63,7 @@ We extracted ~250K raw questions from Physics Wallah's internal database and app
 Final curated dataset: ~130K high-quality questions.
 
 For each question:
-* Generated 4 CoTs using
+* Generated 4 CoTs using Aryabhata 0.5.
 * Retained only those leading to correct final answers.
 
 Resulting Dataset:

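This keep-only-correct curation is a rejection-sampling filter. A minimal sketch; the sampler and answer parser are passed in as callables because the in-house versions are not public:

```python
# Rejection-sampling curation sketch: sample n CoTs per question and keep
# only those whose parsed final answer matches the reference answer.
from typing import Callable

def curate(
    questions: list[dict],                       # each: {"text": ..., "answer": ...}
    generate_cot: Callable[[str], str],          # question text -> one sampled CoT
    extract_final_answer: Callable[[str], str],  # CoT -> final answer string
    n_samples: int = 4,
) -> list[dict]:
    kept = []
    for q in questions:
        for _ in range(n_samples):
            cot = generate_cot(q["text"])
            if extract_final_answer(cot) == q["answer"]:
                kept.append({"question": q["text"], "cot": cot})
    return kept
```
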
@@ -79,7 +79,7 @@ We used a custom in-house variant of Group Relative Policy Optimization (GRPO),
 
 We used RLVR on the remaining ~30K questions.
 
-This multi-phase training strategy allows
+This multi-phase training strategy allows Aryabhata 1.0 to capture **pedagogy-aligned reasoning patterns**, making it highly effective for solving real student queries in mathematics.
 
 ---
 

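In RLVR the reward comes from a programmatic grader rather than a learned reward model. A minimal sketch of such a binary reward, assuming a "Final Answer:" marker and plain string normalization (a production grader would typically add symbolic equivalence checks, e.g. via sympy):

```python
# Verifiable-reward sketch: 1.0 if the completion's final answer matches the
# reference after normalization, else 0.0. The "Final Answer:" marker and the
# normalization rule are assumptions, not the model's documented output format.
def verifiable_reward(completion: str, reference: str) -> float:
    def normalize(ans: str) -> str:
        return ans.strip().rstrip(".").replace(" ", "").lower()

    tail = completion.rsplit("Final Answer:", 1)[-1]
    return 1.0 if normalize(tail) == normalize(reference) else 0.0
```
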
@@ -111,11 +111,11 @@ We used a composite evaluation metric to reflect real-world grading rigor and re
 
 ### 🔹 Accuracy Comparison Across Models
 
-> *
+> *Aryabhata has the best accuracy on JEE Main Maths, on par with frontier models*
 
 ### 🔹 Accuracy vs Token Usage
 
-> *
+> *Aryabhata is on par with frontier models in terms of accuracy vs token usage*
 
 ---
 

@@ -134,7 +134,7 @@ We used a composite evaluation metric to reflect real-world grading rigor and re
 ```python
 from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig
 
-model_id = "PhysicsWallahAI/
+model_id = "PhysicsWallahAI/Aryabhata-1.0"
 
 tokenizer = AutoTokenizer.from_pretrained(model_id)
 model = AutoModelForCausalLM.from_pretrained(model_id)

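The diff window cuts the snippet off after loading; a plausible continuation, sketched under the assumption that the model ships a chat template (the prompt, template call, and token budget are illustrative, not the card's exact code):

```python
# Illustrative continuation: format a question with the chat template and
# decode only the newly generated tokens. All values are assumptions.
messages = [{"role": "user", "content": "Find all the values of \\sqrt[3]{1}"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output_ids = model.generate(input_ids, max_new_tokens=1024)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
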
@@ -181,7 +181,7 @@ To run the model efficiently using vLLM:
 from vllm import LLM, SamplingParams
 
 # Initialize model (downloads from Hugging Face if not local)
-llm = LLM(model="PhysicsWallahAI/
+llm = LLM(model="PhysicsWallahAI/Aryabhata-1.0")
 
 # Define prompt and sampling configuration
 query = 'Find all the values of \\sqrt[3]{1}'

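The window ends before the sampling call; a continuation consistent with the `results` variable printed in the next hunk might look like this (temperature and token budget are assumptions):

```python
# Illustrative sampling step; `results` is a list of vLLM RequestOutput
# objects, matching the print call shown in the next hunk. Values assumed.
sampling_params = SamplingParams(temperature=0.0, max_tokens=4096)
results = llm.generate([query], sampling_params)
print(results[0].outputs[0].text.strip())
```
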
@@ -200,7 +200,7 @@ print(results[0].outputs[0].text.strip())
 
 ## 🚀 Roadmap
 
-**
+**Aryabhata 2.0** (Upcoming):
 - Extending domain coverage to **Physics** and **Chemistry**
 - Supporting **JEE Advanced**, **NEET**, and **Foundation syllabus**
 - Further optimization for affordability and accuracy in real-time deployments

@@ -212,9 +212,9 @@ print(results[0].outputs[0].text.strip())
 If you use this model, please cite:
 
 ```bibtex
-@misc{
-  title = {
+@misc{Aryabhata2025,
+  title = {Aryabhata 1.0: A compact, exam-focused language model tailored for mathematics in Indian competitive exams, especially JEE Main},
   author = {Physics Wallah AI Research},
   year = {2025},
-  note = {\url{https://huggingface.co/PhysicsWallahAI/
+  note = {\url{https://huggingface.co/PhysicsWallahAI/Aryabhata-1.0}},
 }
|