Prince-1 committed

Commit a2d7d40 · verified · 1 parent: 5b3f91b

Add files using upload-large-folder tool
.gitattributes CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
+ model.onnx.data filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,228 @@
+ ---
+ license: apache-2.0
+ tags:
+ - small-language-model
+ - jee
+ - exam-centric
+ - indian-education
+ - reinforcement-learning
+ - supervised-finetuning
+ - model-merging
+ - rejection-sampling
+ - mathematics
+ - ai4education
+ - physicswallah
+ - onnx
+ - onnxruntime-genai
+ - onnxruntime
+ language:
+ - en
+ library_name: onnxruntime-genai
+ base_model_relation: quantized
+ base_model: Prince-1/Aryabhata-1.0
+ pipeline_tag: text-generation
+ model_creator: Physics Wallah AI Research
+ model_type: Causal decoder-based model
+ ---
+
+ # Aryabhata 1.0: An exam-focused language model for JEE Math
+
+ ![](benchmark.png)
+
+ ## Overview
+
+ **Aryabhata 1.0** is a 7B-parameter small language model for mathematics developed by **Physics Wallah AI Research**, optimized for high-stakes Indian competitive exams like **JEE Mains**. Despite its compact size, Aryabhata 1.0 achieves **state-of-the-art performance** on exam-centric reasoning tasks with impressive **token efficiency** and low inference cost.
+
+ > 🚧 *Aryabhata 1.0 is an **experimental release**. We are actively seeking feedback — please contribute in the Discussion tab of this repo.*
+
+ ---
+
+ ## 🧠 Key Features
+
+ - **Architecture**: 7B-parameter causal decoder-based model.
+ - **Exam-Centric Optimization**: Specifically tuned for JEE-level mathematics reasoning.
+ - **High Accuracy**:
+   - **86%** on the **JEE Mains January 2025** session.
+   - **90.2%** on the **JEE Mains April 2025** session.
+ - **Token Efficiency**: Operates effectively within a **~2K-token window**, compared to the ~8K tokens other reasoning models typically require.
+ - **Compute Efficient**: Trained on a **1×2 NVIDIA H100 GPU** setup using an optimized pipeline.
+
+ ---
+
+ ## 🛠️ Training Details
+
+ - **Training Data**: ~130K problem–solution pairs curated from proprietary Physics Wallah exam datasets.
+ - **Training Pipeline**:
+   - **Model Merging**
+   - **Rejection Sampling**
+   - **Supervised Fine-Tuning (SFT)**
+   - **Reinforcement Learning with Verifiable Rewards (RLVR)**
+
+ ### 🔀 Model Merging
+ We began with model merging (a weighted average of parameters, sketched below) to build a strong initialization (Aryabhata 0.5) by combining diverse model capabilities:
+ * Qwen 2.5 Math: a robust math-centric LLM with solid symbolic-math foundations.
+ * AceMath: an enhanced version of Qwen 2.5 Math, fine-tuned by NVIDIA for improved accuracy on mathematics benchmarks.
+ * DeepSeek R1 Distill Qwen: a long-form reasoning model, fine-tuned on reasoning traces distilled from DeepSeek R1.
+
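+ A minimal sketch of such a weighted-average merge, assuming all three checkpoints share the same Qwen2.5-7B architecture (so their state dicts align); the repo IDs and merge weights here are illustrative placeholders, not the published recipe:
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM
+
+ # Hypothetical merge weights; the actual ratios used for Aryabhata 0.5 are not published.
+ checkpoints = {
+     "Qwen/Qwen2.5-Math-7B": 0.4,
+     "nvidia/AceMath-7B-Instruct": 0.3,
+     "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B": 0.3,
+ }
+
+ merged = None
+ for repo_id, w in checkpoints.items():
+     state = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.float32).state_dict()
+     if merged is None:
+         merged = {k: w * v for k, v in state.items()}
+     else:
+         for k, v in state.items():
+             merged[k] += w * v  # identical architectures, so parameter keys line up
+
+ # Load the averaged weights back into one of the bases and save the merged model.
+ base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Math-7B", torch_dtype=torch.float32)
+ base.load_state_dict(merged)
+ base.save_pretrained("aryabhata-0.5-merged")
+ ```
+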
+ ### 📚 Data Curation + Rejection Sampling
+ We extracted ~250K raw questions from Physics Wallah's internal database and applied aggressive filtering and cleaning:
+ * Removed: diagram-based, non-English, and option-heavy questions.
+ * Kept: questions matching the distribution of JEE Main 2019–2024.
+
+ Final curated dataset: ~130K high-quality questions.
+
+ For each question, we then applied rejection sampling (sketched below):
+ * Generated 4 chains of thought (CoTs) using Aryabhata 0.5.
+ * Retained only those leading to correct final answers.
+
+ Resulting dataset:
+ * ~100K questions
+ * ~350K high-quality CoTs
+
+ We used this dataset for SFT.
+
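+ A minimal sketch of that filtering loop; `generate_cots` (Aryabhata 0.5 inference) and `answer_matches` (the answer verifier) are hypothetical callables supplied by the caller:
+
+ ```python
+ def build_sft_dataset(questions, generate_cots, answer_matches, n_samples=4):
+     """Rejection sampling: keep only chains of thought whose final answer is correct."""
+     dataset = []
+     for q in questions:
+         for cot in generate_cots(q["text"], n=n_samples):  # sample 4 CoTs per question
+             if answer_matches(cot, q["answer"]):           # reject traces with wrong answers
+                 dataset.append({"question": q["text"], "solution": cot})
+     return dataset
+ ```
+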
+ ### 🎯 Reinforcement Learning with Verifiable Rewards (RLVR)
+ We used a custom in-house variant of Group Relative Policy Optimization (GRPO), adapted for math-specific reward functions:
+ * Removed the KL-divergence penalty
+ * Removed clipping
+
+ We used RLVR on the remaining ~30K questions; an illustrative loss sketch follows.
+
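+ With the KL penalty and clipping removed, the per-question objective reduces to a REINFORCE-style policy gradient weighted by group-normalized advantages. This is only an illustration of that reduction, not the unpublished in-house implementation:
+
+ ```python
+ import torch
+
+ def grpo_loss(logprobs: torch.Tensor, rewards: torch.Tensor) -> torch.Tensor:
+     """
+     logprobs: (G,) summed token log-probs of G sampled solutions to one question
+     rewards:  (G,) verifiable rewards, e.g. 1.0 if the final answer is correct else 0.0
+     """
+     # Group-relative advantage: normalize rewards within the group of G samples.
+     adv = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
+     # No KL term against a reference policy and no PPO-style ratio clipping,
+     # per the modifications listed above.
+     return -(adv.detach() * logprobs).mean()
+ ```
+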
+ This multi-phase training strategy allows Aryabhata 1.0 to capture **pedagogy-aligned reasoning patterns**, making it highly effective for solving real student queries in mathematics.
+
+ ---
+
+ ## 📊 Performance Highlights
+
+ ### Evaluation Setup
+ All evaluations were performed with temperature = 0.0, and we report pass@1 accuracy.
+
+ #### Evaluation Datasets
+ We evaluated the model on two sets of official JEE Mains 2025 mathematics papers:
+ * January session: 10 question papers containing 250 questions.
+ * April session: 9 question papers containing 225 questions.
+
+ Each paper includes a mix of:
+ * Multiple Choice Questions (MCQs) with one correct option
+ * Numeric Answer Type (NAT) questions requiring precise numerical responses
+
+ #### Evaluation Metric
+ We used a composite evaluation metric to reflect real-world grading rigor and reduce false positives; a sketch of the cascade follows the list:
+
+ 1. Float Match
+    * Compares predicted and target answers within a tolerance (±1e-9)
+    * Handles rounding artifacts and small numerical errors robustly
+ 2. String Match
+    * Used for symbolic answers (e.g., fractions, radicals)
+    * Uses strict exact match — predictions must match ground truth character-for-character
+ 3. LLM-as-Judge (GPT-4o-mini)
+    * Used to judge mathematical equivalence in ambiguous formats
+
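+ A minimal sketch of the cascade; `llm_judge_equivalent` is a hypothetical callable wrapping the GPT-4o-mini judge:
+
+ ```python
+ import math
+
+ def composite_match(pred: str, target: str, llm_judge_equivalent, tol: float = 1e-9) -> bool:
+     # 1. Float match: numeric comparison within ±1e-9.
+     try:
+         if math.isclose(float(pred), float(target), abs_tol=tol):
+             return True
+     except ValueError:
+         pass  # not parseable as floats; fall through to the symbolic checks
+     # 2. String match: strict, character-for-character, for symbolic answers.
+     if pred.strip() == target.strip():
+         return True
+     # 3. LLM-as-judge: mathematical equivalence for ambiguous formats.
+     return llm_judge_equivalent(pred, target)
+ ```
+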
+ ### 🔹 Accuracy Comparison Across Models
+ ![](accuracy.png)
+ > *Aryabhata achieves the best accuracy on JEE Main Maths, on par with frontier models.*
+
+ ### 🔹 Accuracy vs Token Usage
+ ![](accuracy-vs-token.png)
+ > *Aryabhata matches frontier models on accuracy while staying competitive on token usage.*
+
+ ---
+
+ ## 🔧 Intended Use
+
+ **Primary Use Cases**:
+ - Competitive exam preparation (JEE Main-level mathematics problems)
+ - Question answering and doubt-solving systems
+ - Educational tutoring and concept explanation
+
+ ## 💡 How to Use
+
+ ### 🧪 Using with 🤗 Transformers
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig
+
+ model_id = "PhysicsWallahAI/Aryabhata-1.0"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(model_id)
+
+ # Stop strings sometimes emitted by the model; stripped from the decoded output below
+ stop_strings = ["<|im_end|>", "<|end|>", "<im_start|>", "```python\n", "<|im_start|>", "]}}]}}]"]
+
+ def strip_bad_tokens(s, stop_strings):
+     for suffix in stop_strings:
+         if s.endswith(suffix):
+             return s[:-len(suffix)]
+     return s
+
+ # Create generation config (can also set temperature, top_p, etc.)
+ generation_config = GenerationConfig(
+     max_new_tokens=4096,
+     stop_strings=stop_strings,
+ )
+
+ query = 'Find all the values of \\sqrt[3]{1}'
+ messages = [{'role': 'system', 'content': 'Think step-by-step; put only the final answer inside \\boxed{}.'},
+             {'role': 'user', 'content': query}]
+
+ text = tokenizer.apply_chat_template(
+     messages,
+     tokenize=False,
+     add_generation_prompt=True
+ )
+ inputs = tokenizer([text], return_tensors="pt")
+ outputs = model.generate(**inputs, generation_config=generation_config, tokenizer=tokenizer)
+
+ print(strip_bad_tokens(tokenizer.decode(outputs[0], skip_special_tokens=True), stop_strings))
+ ```
+
+ ---
+
+ ### ⚡ Using with vLLM
+
+ To run the model efficiently using vLLM:
+
+ ```python
+ from vllm import LLM, SamplingParams
+
+ # Initialize the model (downloads from Hugging Face if not cached locally)
+ llm = LLM(model="PhysicsWallahAI/Aryabhata-1.0")
+
+ # Define the prompt and sampling configuration
+ query = 'Find all the values of \\sqrt[3]{1}'
+ messages = [{'role': 'system', 'content': 'Think step-by-step; put only the final answer inside \\boxed{}.'},
+             {'role': 'user', 'content': query}]
+ sampling_params = SamplingParams(temperature=0.0, max_tokens=4*1024,
+                                  stop=["<|im_end|>", "<|end|>", "<im_start|>", "```python\n", "<|im_start|>", "]}}]}}]"])
+
+ # Run inference
+ results = llm.chat(messages, sampling_params)
+
+ # Print the result
+ print(results[0].outputs[0].text.strip())
+ ```
+
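+ ---
+
+ ### 🧩 Using with ONNX Runtime GenAI
+
+ This repo ships an ONNX export (`model.onnx` + `model.onnx.data`) together with a `genai_config.json`, so it can also be run with the `onnxruntime-genai` package. A minimal sketch, assuming the repo has been downloaded to a local folder and the current Python API of `onnxruntime-genai`; the prompt is formatted by hand to match the chat template:
+
+ ```python
+ import onnxruntime_genai as og
+
+ # Local folder containing model.onnx, model.onnx.data and genai_config.json
+ model = og.Model("./Aryabhata-1.0-Onnx")
+ tokenizer = og.Tokenizer(model)
+ stream = tokenizer.create_stream()
+
+ query = "Find all the values of \\sqrt[3]{1}"
+ prompt = ("<|im_start|>system\nThink step-by-step; put only the final answer inside \\boxed{}.<|im_end|>\n"
+           f"<|im_start|>user\n{query}<|im_end|>\n<|im_start|>assistant\n")
+
+ params = og.GeneratorParams(model)
+ params.set_search_options(max_length=4096)
+
+ generator = og.Generator(model, params)
+ generator.append_tokens(tokenizer.encode(prompt))
+ while not generator.is_done():
+     generator.generate_next_token()
+     print(stream.decode(generator.get_next_tokens()[0]), end="", flush=True)
+ ```
+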
+ ---
+
+ ## 🚀 Roadmap
+
+ **Aryabhata 2.0** (upcoming):
+ - Extending domain coverage to **Physics** and **Chemistry**
+ - Supporting **JEE Advanced**, **NEET**, and the **Foundation syllabus**
+ - Further optimization for affordability and accuracy in real-time deployments
+
+ ---
+
+ ## 🤝 Citation
+
+ If you use this model, please cite:
+
+ ```bibtex
+ @misc{Aryabhata2025,
+   title  = {Aryabhata 1.0: A compact, exam-focused language model tailored for mathematics in Indian competitive exams, especially JEE Main},
+   author = {Physics Wallah AI Research},
+   year   = {2025},
+   note   = {\url{https://huggingface.co/PhysicsWallahAI/Aryabhata-1.0}},
+ }
+ ```
chat_template.jinja ADDED
@@ -0,0 +1,54 @@
+ {%- if tools %}
+ {{- '<|im_start|>system\n' }}
+ {%- if messages[0]['role'] == 'system' %}
+ {{- messages[0]['content'] }}
+ {%- else %}
+ {{- 'Please reason step by step, and put your final answer within \\boxed{}.' }}
+ {%- endif %}
+ {{- "\n\n# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
+ {%- for tool in tools %}
+ {{- "\n" }}
+ {{- tool | tojson }}
+ {%- endfor %}
+ {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
+ {%- else %}
+ {%- if messages[0]['role'] == 'system' %}
+ {{- '<|im_start|>system\n' + messages[0]['content'] + '<|im_end|>\n' }}
+ {%- else %}
+ {{- '<|im_start|>system\nPlease reason step by step, and put your final answer within \\boxed{}.<|im_end|>\n' }}
+ {%- endif %}
+ {%- endif %}
+ {%- for message in messages %}
+ {%- if (message.role == "user") or (message.role == "system" and not loop.first) or (message.role == "assistant" and not message.tool_calls) %}
+ {{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n' }}
+ {%- elif message.role == "assistant" %}
+ {{- '<|im_start|>' + message.role }}
+ {%- if message.content %}
+ {{- '\n' + message.content }}
+ {%- endif %}
+ {%- for tool_call in message.tool_calls %}
+ {%- if tool_call.function is defined %}
+ {%- set tool_call = tool_call.function %}
+ {%- endif %}
+ {{- '\n<tool_call>\n{"name": "' }}
+ {{- tool_call.name }}
+ {{- '", "arguments": ' }}
+ {{- tool_call.arguments | tojson }}
+ {{- '}\n</tool_call>' }}
+ {%- endfor %}
+ {{- '<|im_end|>\n' }}
+ {%- elif message.role == "tool" %}
+ {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != "tool") %}
+ {{- '<|im_start|>user' }}
+ {%- endif %}
+ {{- '\n<tool_response>\n' }}
+ {{- message.content }}
+ {{- '\n</tool_response>' }}
+ {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
+ {{- '<|im_end|>\n' }}
+ {%- endif %}
+ {%- endif %}
+ {%- endfor %}
+ {%- if add_generation_prompt %}
+ {{- '<|im_start|>assistant\n' }}
+ {%- endif %}
genai_config.json ADDED
@@ -0,0 +1,50 @@
+ {
+     "model": {
+         "bos_token_id": 151643,
+         "context_length": 131072,
+         "decoder": {
+             "session_options": {
+                 "log_id": "onnxruntime-genai",
+                 "provider_options": []
+             },
+             "filename": "model.onnx",
+             "head_size": 128,
+             "hidden_size": 3584,
+             "inputs": {
+                 "input_ids": "input_ids",
+                 "attention_mask": "attention_mask",
+                 "position_ids": "position_ids",
+                 "past_key_names": "past_key_values.%d.key",
+                 "past_value_names": "past_key_values.%d.value"
+             },
+             "outputs": {
+                 "logits": "logits",
+                 "present_key_names": "present.%d.key",
+                 "present_value_names": "present.%d.value"
+             },
+             "num_attention_heads": 28,
+             "num_hidden_layers": 28,
+             "num_key_value_heads": 4
+         },
+         "eos_token_id": 151643,
+         "pad_token_id": 151643,
+         "type": "qwen2",
+         "vocab_size": 152064
+     },
+     "search": {
+         "diversity_penalty": 0.0,
+         "do_sample": false,
+         "early_stopping": true,
+         "length_penalty": 1.0,
+         "max_length": 131072,
+         "min_length": 0,
+         "no_repeat_ngram_size": 0,
+         "num_beams": 1,
+         "num_return_sequences": 1,
+         "past_present_share_buffer": false,
+         "repetition_penalty": 1.0,
+         "temperature": 1.0,
+         "top_k": 1,
+         "top_p": 1.0
+     }
+ }
model.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:395e73c4753dfcd56a866f02704e187020507a2c3312cb3bcbe56465000f5100
+ size 688147
model.onnx.data ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2e4dca4d35cceb00909023e30a767da00b18af004a05844639813fc283f093e8
+ size 15264787456
special_tokens_map.json ADDED
@@ -0,0 +1,23 @@
+ {
+   "bos_token": {
+     "content": "<|begin▁of▁sentence|>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "<|end▁of▁sentence|>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<|end▁of▁sentence|>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e20ddafc659ba90242154b55275402edeca0715e5dbb30f56815a4ce081f4893
+ size 11422778
tokenizer_config.json ADDED
@@ -0,0 +1,194 @@
+ {
+   "add_bos_token": true,
+   "add_eos_token": false,
+   "add_prefix_space": null,
+   "added_tokens_decoder": {
+     "151643": {
+       "content": "<|end▁of▁sentence|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151644": {
+       "content": "<|User|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151645": {
+       "content": "<|Assistant|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151646": {
+       "content": "<|begin▁of▁sentence|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151647": {
+       "content": "<|EOT|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151648": {
+       "content": "<think>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151649": {
+       "content": "</think>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151650": {
+       "content": "<|quad_start|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151651": {
+       "content": "<|quad_end|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151652": {
+       "content": "<|vision_start|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151653": {
+       "content": "<|vision_end|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151654": {
+       "content": "<|vision_pad|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151655": {
+       "content": "<|image_pad|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151656": {
+       "content": "<|video_pad|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151657": {
+       "content": "<tool_call>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151658": {
+       "content": "</tool_call>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151659": {
+       "content": "<|fim_prefix|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151660": {
+       "content": "<|fim_middle|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151661": {
+       "content": "<|fim_suffix|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151662": {
+       "content": "<|fim_pad|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151663": {
+       "content": "<|repo_name|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151664": {
+       "content": "<|file_sep|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     }
+   },
+   "bos_token": "<|begin▁of▁sentence|>",
+   "clean_up_tokenization_spaces": false,
+   "eos_token": "<|end▁of▁sentence|>",
+   "extra_special_tokens": {},
+   "legacy": true,
+   "model_max_length": 16384,
+   "pad_token": "<|end▁of▁sentence|>",
+   "sp_model_kwargs": {},
+   "tokenizer_class": "LlamaTokenizerFast",
+   "unk_token": null,
+   "use_default_system_prompt": false
+ }