- Group Sequence Policy Optimization
  Paper • 2507.18071 • Published • 294
- LAPO: Internalizing Reasoning Efficiency via Length-Adaptive Policy Optimization
  Paper • 2507.15758 • Published • 34
- Sample More to Think Less: Group Filtered Policy Optimization for Concise Reasoning
  Paper • 2508.09726 • Published • 13
- BeyondWeb: Lessons from Scaling Synthetic Data for Trillion-scale Pretraining
  Paper • 2508.10975 • Published • 57
Rafael Coelho de Souza Krzonkalla
krzonkalla
Recent Activity
upvoted a paper about 2 hours ago
Set Block Decoding is a Language Model Inference Accelerator
updated a collection about 2 hours ago
relevant_papers
upvoted a paper about 18 hours ago
Why Language Models Hallucinate