Running on CPU Upgrade Featured 2.97k The Smol Training Playbook 📚 2.97k The secrets to building world-class LLMs
AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning Paper • 2509.08755 • Published Sep 10, 2025 • 57