# Ring-mini-2.0
🤗 Hugging Face | 🤖 ModelScope
## Introduction

We present Ring-mini-2.0, a compact yet powerful reasoning model. It has 16.8B total parameters, of which only 1.4B are activated per input token (789M non-embedding). Despite its compact size, Ring-mini-2.0 reaches the top tier of sub-10B dense LLMs and even matches or surpasses much larger MoE models, thanks to pre-training on 20T tokens of high-quality data followed by long-CoT supervised fine-tuning and multi-stage reinforcement learning.
## Model Downloads

| Model | #Total Params | #Activated Params | Context Length | Download |
|---|---|---|---|---|
| Ring-mini-2.0 | 16.8B | 1.4B | 128K | 🤗 HuggingFace |
| Ring-lite-2507 | 16.8B | 2.75B | 128K | 🤗 HuggingFace |
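If you prefer to fetch the weights ahead of time rather than on first load, here is a minimal sketch using the `huggingface_hub` client (assuming it is installed, e.g. via `pip install huggingface_hub`):

```python
from huggingface_hub import snapshot_download

# Download the full model repository into the local Hugging Face cache;
# the function returns the local directory containing the files.
local_dir = snapshot_download(repo_id="inclusionAI/Ring-mini-2.0")
print(local_dir)
```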
## Evaluation

For a comprehensive evaluation of our reasoning models, we ran automated benchmarks covering math, code, and science. The results indicate that Ring-mini-2.0 achieves performance comparable to Ring-lite-2507 while activating only about half as many parameters.
## Quickstart

### 🤗 Hugging Face Transformers

Here is a code snippet showing how to chat with the model using transformers:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "inclusionAI/Ring-mini-2.0"

# Load the model and tokenizer; device_map="auto" places weights on available devices
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Give me a short introduction to large language models."
messages = [
    {"role": "system", "content": "You are Ring, an assistant created by inclusionAI"},
    {"role": "user", "content": prompt}
]

# Render the chat template; enable_thinking=True turns on the model's reasoning mode
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True
)
model_inputs = tokenizer([text], return_tensors="pt", return_token_type_ids=False).to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=8192
)
# Strip the prompt tokens, keeping only the newly generated completion
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
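For interactive use you may want tokens printed as they are generated rather than all at once. Below is a minimal sketch reusing the `model`, `tokenizer`, and `model_inputs` from the snippet above with transformers' built-in `TextStreamer`; it is not part of the original example, and the generation settings are illustrative:

```python
from transformers import TextStreamer

# Stream decoded tokens to stdout as they are produced, skipping the echoed prompt
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(
    **model_inputs,
    max_new_tokens=8192,
    streamer=streamer
)
```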
## Deployment

Please refer to the GitHub repository for deployment details.
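For higher-throughput serving, an inference engine such as vLLM is a common choice. The following is a minimal offline-inference sketch, assuming a vLLM build that supports this model's architecture (check the GitHub repository for supported versions); the sampling settings are illustrative, not recommended values:

```python
from vllm import LLM, SamplingParams

# Load the model into vLLM; trust_remote_code is needed for custom architectures
llm = LLM(model="inclusionAI/Ring-mini-2.0", trust_remote_code=True)
sampling_params = SamplingParams(temperature=0.7, max_tokens=8192)

# vLLM applies the model's chat template to the messages before generation
outputs = llm.chat(
    [{"role": "user", "content": "Give me a short introduction to large language models."}],
    sampling_params,
)
print(outputs[0].outputs[0].text)
```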
## License

This code repository is licensed under the MIT License.
## Citation

TODO