Collections
Discover the best community collections!

Collections including paper arxiv:2502.02737

- FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
  Paper • 2506.20920 • Published • 70
- SmolVLM: Redefining small and efficient multimodal models
  Paper • 2504.05299 • Published • 199
- The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset
  Paper • 2303.03915 • Published • 7
- SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
  Paper • 2502.02737 • Published • 242
- Reinforcement Pre-Training
  Paper • 2506.08007 • Published • 260
- A Survey on Latent Reasoning
  Paper • 2507.06203 • Published • 90
- Language Models are Few-Shot Learners
  Paper • 2005.14165 • Published • 16
- Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
  Paper • 1910.10683 • Published • 14
- FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
  Paper • 2506.20920 • Published • 70
- SmolVLM: Redefining small and efficient multimodal models
  Paper • 2504.05299 • Published • 199
- YourBench: Easy Custom Evaluation Sets for Everyone
  Paper • 2504.01833 • Published • 22
- SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
  Paper • 2502.02737 • Published • 242
- GAIA: a benchmark for General AI Assistants
  Paper • 2311.12983 • Published • 231
- Zephyr: Direct Distillation of LM Alignment
  Paper • 2310.16944 • Published • 122
- SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
  Paper • 2502.02737 • Published • 242
- Global MMLU: Understanding and Addressing Cultural and Linguistic Biases in Multilingual Evaluation
  Paper • 2412.03304 • Published • 21
- SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
  Paper • 2502.02737 • Published • 242
- Demystifying Long Chain-of-Thought Reasoning in LLMs
  Paper • 2502.03373 • Published • 59
- Kimi k1.5: Scaling Reinforcement Learning with LLMs
  Paper • 2501.12599 • Published • 123
- SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
  Paper • 2501.17161 • Published • 123