Blog, Articles, and discussions

mmBERT: ModernBERT goes Multilingual

By September 9, 2025 • 92

Community Articles

Introducing the Palmyra-mini family: Powerful, lightweight, and ready to reason!

and 1 other •

9 days ago

• 55

AtlasOCR: Building the First Open-Source Darija OCR Model with Vision Language Models

and 4 others •

4 days ago

• 10

"Anemll-style" Root-Mean-Square (RMS) Normalization on the Apple Neural Engine: A Simple Hack

•

4 days ago

• 9

Code a simple RAG from scratch

•

Oct 29, 2024

• 198

How to Train an Antibody Developability Model

and 1 other •

3 days ago

• 7

🌎 What kind of environmental impacts are AI companies disclosing? (And can we compare them?) 🌎

and 1 other •

3 days ago

• 7

Unleashing the Full Potential of ERNIE4.5 using FastDeploy

and 3 others •

1 day ago

• 7

Small Language Models (SLM): A Comprehensive Overview

•

Feb 22

• 68

Use AI on Your PC: Optimize and Deploy a Multimodal Agentic Pipeline on AI PC Powered by Intel

and 2 others •

3 days ago

• 5

Finegrain Product Placement LoRA (experiment)

•

2 days ago

• 5

Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth

•

Jul 29, 2024

• 360

Decoding Strategies in Large Language Models

•

Oct 29, 2024

• 89

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

•

Feb 7

• 218

From GRPO to DAPO and GSPO: What, Why, and How

•

Aug 9

• 28

Post-Training Isaac GR00T N1.5 for LeRobot SO-101 Arm

and 5 others •

Jun 11

• 91

Diffusion Language Models: The New Paradigm

•

Jun 10

• 16

🥬 TinyLettuce: Efficient Hallucination Detection with 17–68M Encoders

and 1 other •

20 days ago

• 12

Spread Your Wings: Falcon 180B is here

By September 6, 2023 • 9

Code Llama: Llama 2 learns to code

By August 25, 2023 • 10

Introducing IDEFICS: An Open Reproduction of State-of-the-art Visual Language Model

By August 22, 2023 • 36

Fine-tune Llama 2 with DPO

By August 8, 2023 • 63

Llama 2 is here - get it on Hugging Face

By July 18, 2023 • 30

Open-Source Text Generation & LLM Ecosystem at Hugging Face

By July 17, 2023 • 3

Accelerating Vision-Language Models: BridgeTower on Habana Gaudi2

By June 29, 2023 • 3

What's going on with the Open LLM Leaderboard?

By June 23, 2023 • 43

Can foundation models label data like humans?

By June 12, 2023 • 1

Welcome fastText to the 🤗 Hub

By June 6, 2023 • 5

The Falcon has landed in the Hugging Face ecosystem

By June 5, 2023 • 17

Smaller is better: Q8-Chat, an efficient generative AI experience on Xeon

By May 16, 2023 • 2

Run a Chatgpt-like Chatbot on a Single GPU with ROCm

By May 15, 2023 • 2

Introducing RWKV — An RNN with the advantages of a transformer

By May 15, 2023 • 23

Community Articles

PP-OCRv5 on Hugging Face: A Specialized Approach to OCR

and 5 others •

10 days ago

• 95

How to Choose the Best Open Source LLM for Your Project in 2025

•

11 days ago

• 68

mem-agent: Persistent, Human Readable Memory Agent Trained with Online RL

and 1 other •

9 days ago

• 16
