ARPO
The official datasets and model checkpoints of ARPO.

Agentic Reinforced Policy Optimization
Paper • 2507.19849 • Published Jul 26 • 148

dongguanting/Qwen3-8B-ARPO-DeepSearch
8B • Updated Jul 29 • 44 • 1

dongguanting/Qwen3-14B-ARPO-DeepSearch
Text Generation • 15B • Updated 28 days ago • 64 • 4

dongguanting/Qwen2.5-7B-ARPO
Text Generation • 8B • Updated 21 days ago • 64 • 2

Tool-Star
Tool-Star is a reinforcement learning-based framework designed to empower LLMs to autonomously invoke multiple external tools during stepwise reasoning.

Tool-Star: Empowering LLM-Brained Multi-Tool Reasoner via Reinforcement Learning
Paper • 2505.16410 • Published May 22 • 57

dongguanting/Tool-Star-SFT-54K
Viewer • Updated May 29 • 54k • 290 • 8

dongguanting/Multi-Tool-RL-10K
Viewer • Updated May 25 • 10k • 142 • 4

dongguanting/Tool-Star-Qwen-7B
Text Generation • 8B • Updated Jun 30 • 20 • 2