pw-ai-research committed · Commit 1593799 · verified · 1 Parent(s): e4a6f6e

Update README.md

Files changed (1):
  1. README.md +14 -14
README.md CHANGED
@@ -14,7 +14,7 @@ tags:
 - physicswallah
 language:
 - en
-model_name: PhysicsWallah/Aryabhatta-1.0
+model_name: PhysicsWallah/Aryabhata-1.0
 model_creator: Physics Wallah AI Research
 model_type: Causal decoder-based model
 base_model: Qwen/Qwen2.5-Math-7B
@@ -23,10 +23,10 @@ pipeline_tag: text-generation
 
 # Aryabhatta 1.0 🌟
 
-**Aryabhatta 1.0** is a 7B parameter small language model for mathematics developed by **Physics Wallah AI Research**, optimized for high-stakes Indian competitive exams like **JEE Mains**. Despite its compact size, Aryabhatta 1.0 achieves **state-of-the-art performance** on exam-centric reasoning tasks with impressive **token efficiency** and low inference cost.
+**Aryabhata 1.0** is a 7B parameter small language model for mathematics developed by **Physics Wallah AI Research**, optimized for high-stakes Indian competitive exams like **JEE Mains**. Despite its compact size, Aryabhata 1.0 achieves **state-of-the-art performance** on exam-centric reasoning tasks with impressive **token efficiency** and low inference cost.
 
 
-> 🚧 *Aryabhatta 1.0 is an **experimental release**. We are actively seeking feedback — please contribute in the Discussion tab of this repo.*
+> 🚧 *Aryabhata 1.0 is an **experimental release**. We are actively seeking feedback — please contribute in the Discussion tab of this repo.*
 ---
 
 ## 🧠 Key Features
@@ -51,7 +51,7 @@ pipeline_tag: text-generation
 - **Reinforcement Learning with Verifiable Rewards (RLVR)**
 
 ### 🔀 Model Merging
-We began with model merging (Weighted average) to build a strong initialization (Aryabhatta 0.5) by combining diverse model capabilities:
+We began with model merging (Weighted average) to build a strong initialization (Aryabhata 0.5) by combining diverse model capabilities:
 * Qwen 2.5 Math: A robust math-centric LLM with solid symbolic math foundations.
 * Ace Math: An enhanced version of Qwen 2.5 Math, fine-tuned by NVIDIA for improved accuracy in mathematics benchmarks.
 * DeepSeek R1 Distill Qwen: A long-form reasoning model, fine-tuned on reasoning traces distilled from DeepSeek R1.
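The weighted-average merge described in this hunk can be reproduced directly on the checkpoints' state dicts. A minimal sketch follows, assuming the three parents share the Qwen2.5-7B architecture and tokenizer; the repo IDs and the uniform 1/3 weights are illustrative assumptions, since the README does not publish the exact recipe.

```python
import torch
from transformers import AutoModelForCausalLM

# Assumed parent checkpoints and illustrative merge weights; the actual
# recipe behind Aryabhata 0.5 is not disclosed in the README.
parents = {
    "Qwen/Qwen2.5-Math-7B": 1 / 3,
    "nvidia/AceMath-7B-Instruct": 1 / 3,
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B": 1 / 3,
}

merged = None
for repo_id, w in parents.items():
    model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.float32)
    state = model.state_dict()
    if merged is None:
        merged = {k: w * v for k, v in state.items()}
    else:
        for k, v in state.items():
            merged[k] += w * v
    del model, state  # release each parent before loading the next

# Write the averaged weights back into a fresh copy of the base architecture.
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Math-7B")
base.load_state_dict(merged)
base.save_pretrained("aryabhata-0.5-merged")
```

Averaging in full precision keeps the result exact; the merged checkpoint can be cast back to bf16 for serving.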
@@ -63,7 +63,7 @@ We extracted ~250K raw questions from Physics Wallah's internal database and app
 Final curated dataset: ~130K high-quality questions.
 
 For each question:
-* Generated 4 CoTs using Aryabhatta 0.5.
+* Generated 4 CoTs using Aryabhata 0.5.
 * Retained only those leading to correct final answers.
 
 Resulting Dataset:
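The filtering step in this hunk is rejection sampling: draw a few chains of thought per question and keep only those whose final answer matches the key. A minimal sketch, with the generator and answer-extraction steps passed in as callables because the actual pipeline is internal to Physics Wallah:

```python
from typing import Callable

def curate(questions: list[dict], generate_cot: Callable[[str], str],
           extract_answer: Callable[[str], str], n_samples: int = 4) -> list[dict]:
    """Sample n_samples chains of thought per question and keep only the
    traces whose extracted final answer matches the answer key."""
    kept = []
    for q in questions:
        for _ in range(n_samples):
            trace = generate_cot(q["question"])  # one sampled CoT
            if extract_answer(trace) == q["answer"]:
                kept.append({"question": q["question"], "cot": trace})
    return kept
```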
@@ -79,7 +79,7 @@ We used a custom in-house variant of Group Relative Policy Optimization (GRPO),
 
 We used RLVR on the remaining ~30K questions.
 
-This multi-phase training strategy allows Aryabhatta 1.0 to capture **pedagogy-aligned reasoning patterns**, making it highly effective for solving real student queries in mathematics.
+This multi-phase training strategy allows Aryabhata 1.0 to capture **pedagogy-aligned reasoning patterns**, making it highly effective for solving real student queries in mathematics.
 
 ---
 
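The "verifiable" part of RLVR means the reward comes from checking the final answer rather than from a learned reward model. The README does not spell out the reward function, so the binary exact-match rule sketched below is an assumption; a production grader would normally normalise expressions (for example with sympy) before comparing.

```python
def verifiable_reward(completion: str, answer_key: str) -> float:
    """Binary reward: 1.0 if the completion's final answer matches the key.

    The "Final answer:" convention is an assumption for this sketch; the
    actual in-house GRPO variant and its reward are not published.
    """
    marker = "Final answer:"
    if marker not in completion:
        return 0.0
    predicted = completion.rsplit(marker, 1)[1].strip().rstrip(".")
    return 1.0 if predicted == answer_key.strip() else 0.0
```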
@@ -111,11 +111,11 @@ We used a composite evaluation metric to reflect real-world grading rigor and re
 
 ### 🔹 Accuracy Comparison Across Models
 ![](accuracy.png)
-> *Aryabhatta has the best accuracy on JEE Main Maths, on par with frontier models*
+> *Aryabhata has the best accuracy on JEE Main Maths, on par with frontier models*
 
 ### 🔹 Accuracy vs Token Usage
 ![](accuracy-vs-token.png)
-> *Aryabhatta is on par with frontier models in terms of accuracy vs token usage*
+> *Aryabhata is on par with frontier models in terms of accuracy vs token usage*
 
 ---
 
@@ -134,7 +134,7 @@ We used a composite evaluation metric to reflect real-world grading rigor and re
 ```python
 from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig
 
-model_id = "PhysicsWallahAI/Aryabhatta-1.0"
+model_id = "PhysicsWallahAI/Aryabhata-1.0"
 
 tokenizer = AutoTokenizer.from_pretrained(model_id)
 model = AutoModelForCausalLM.from_pretrained(model_id)
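The hunk shows only the changed identifier; for reference, a complete, runnable version of the Transformers snippet could look like the sketch below. The chat-template call assumes the model ships a Qwen-style template; check the model card for the recommended prompt format.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PhysicsWallahAI/Aryabhata-1.0"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Assumes a chat template is bundled with the tokenizer (true for Qwen-family models).
messages = [{"role": "user", "content": "Find all the values of \\sqrt[3]{1}"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=1024)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```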
@@ -181,7 +181,7 @@ To run the model efficiently using vLLM:
 from vllm import LLM, SamplingParams
 
 # Initialize model (downloads from Hugging Face if not local)
-llm = LLM(model="PhysicsWallahAI/Aryabhatta-1.0")
+llm = LLM(model="PhysicsWallahAI/Aryabhata-1.0")
 
 # Define prompt and sampling configuration
 query = 'Find all the values of \\sqrt[3]{1}'
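Likewise, a self-contained version of the vLLM snippet from this hunk, with illustrative sampling settings rather than officially recommended ones:

```python
from vllm import LLM, SamplingParams

# Initialize model (downloads from Hugging Face if not local)
llm = LLM(model="PhysicsWallahAI/Aryabhata-1.0")

# Sampling values here are assumptions for the sketch, not the authors' settings.
params = SamplingParams(temperature=0.0, max_tokens=1024)

query = 'Find all the values of \\sqrt[3]{1}'
results = llm.generate([query], params)
print(results[0].outputs[0].text.strip())
```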
@@ -200,7 +200,7 @@ print(results[0].outputs[0].text.strip())
 
 ## 🚀 Roadmap
 
-**Aryabhatta 2.0** (Upcoming):
+**Aryabhata 2.0** (Upcoming):
 - Extending domain coverage to **Physics** and **Chemistry**
 - Supporting **JEE Advanced**, **NEET**, and **Foundation syllabus**
 - Further optimization for affordability and accuracy in real-time deployments
@@ -212,9 +212,9 @@ print(results[0].outputs[0].text.strip())
 If you use this model, please cite:
 
 ```bibtex
-@misc{aryabhatta2025,
-title = {Aryabhatta 1.0: A compact, exam-focused language model tailored for mathematics in Indian competitive exams, especially JEE Main.},
+@misc{Aryabhata2025,
+title = {Aryabhata 1.0: A compact, exam-focused language model tailored for mathematics in Indian competitive exams, especially JEE Main.},
 author = {Physics Wallah AI Research},
 year = {2025},
-note = {\url{https://huggingface.co/PhysicsWallahAI/Aryabhatta-1.0}},
+note = {\url{https://huggingface.co/PhysicsWallahAI/Aryabhata-1.0}},
 }