Hugging Face
Blog, Articles, and discussions
Community Articles
We’re open-sourcing our text-to-image model and the process behind it
11 days ago
•
68
Apriel-H1: The Surprising Key to Distilling Efficient Reasoning Models
4 days ago
•
20
Text-to-image Architectural Experiments
9 days ago
•
33
Introducing Cogito v2.1
3 days ago
•
17
Projected Abliteration
28 days ago
•
28
AI Model Optimization More Flexible Than Ever
5 days ago
•
12
The Heterogeneous Feature of RoPE-based Attention in Long-Context LLMs
7 days ago
•
11
How to make NeuTTS-air generate over 200 seconds of audio in a single second.
1 day ago
•
10
ViDoRe V3: a comprehensive evaluation of retrieval for enterprise use-cases
18 days ago
•
50
Norm-Preserving Biprojected Abliteration
16 days ago
•
15
The Pharmome Map: a comprehensive public dataset for drug-target interaction modeling
5 days ago
•
9
KV Caching Explained: Optimizing Transformer Inference Efficiency
Jan 30
•
177
To Think or Not to Think: A Router for Hybrid LLMs
6 days ago
•
8
Uncensor any LLM with abliteration
Jun 13, 2024
•
722
🧠 SQaLe: Enabling new Text-to-SQL models with our massive dataset
4 days ago
•
6
Why Did MiniMax M2 End Up as a Full Attention Model?
24 days ago
•
65
The 1 Billion Token Challenge: Finding the Perfect Pre-training Mix
20 days ago
•
41
Code a simple RAG from scratch
Oct 29, 2024
•
248
Visualizing How VLMs Work
Oct 7
•
45
Granite 4.0 Nano: Just how small can you go?
25 days ago
•
119
vllm
grpo
trl
No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL
+2
95
June 3, 2025
vlm
multimodal
trl
Preference Optimization for Vision Language Models
87
July 10, 2024
research
rl
rlhf
Putting RL back in RLHF
107
June 12, 2024
research
rl
rlhf
Constitutional AI with Open LLMs
+3
17
February 1, 2024
rl
rlhf
nlp
Preference Tuning LLMs with Direct Preference Optimization Methods
+1
74
January 18, 2024
research
rl
rlhf
The N Implementation Details of RLHF with PPO
71
October 24, 2023
guide
diffusers
rl
Finetune Stable Diffusion Models with DDPO via TRL
19
September 29, 2023
rl
rlhf
nlp
Fine-tune Llama 2 with DPO
65
August 8, 2023
rl
rlhf
nlp
StackLLaMA: A hands-on guide to train LLaMA with RLHF
+3
46
April 5, 2023
rl
rlhf
nlp
Fine-tuning 20B LLMs with RLHF on a 24GB consumer GPU
+2
68
March 9, 2023
llms
rlhf
red-teaming
Red-Teaming Large Language Models
34
February 24, 2023
rlhf
ChatGPT
cot
What Makes a Dialog Agent Useful?
2
January 24, 2023
rlhf
rl
guide
Illustrating Reinforcement Learning from Human Feedback (RLHF)
373
December 9, 2022