Mobile-Agent-v3: Fundamental Agents for GUI Automation Paper • 2508.15144 • Published 19 days ago • 61
Perception-Aware Policy Optimization for Multimodal Reasoning Paper • 2507.06448 • Published Jul 8 • 47
Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge Paper • 2506.21506 • Published Jun 26 • 51
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning Paper • 2505.17667 • Published May 23 • 89
QwenLong-CPRS: Towards ∞-LLMs with Dynamic Context Optimization Paper • 2505.18092 • Published May 23 • 44
FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion Paper • 2503.04222 • Published Mar 6 • 15
FuseChat 3.0 Collection Preference Optimization for Implicit Model Fusion • 14 items • Updated Mar 7 • 14
Article FuseO1-Preview: System-II Reasoning Fusion of LLMs By Wanfq and 4 others • Jan 20 • 22
Weighted-Reward Preference Optimization for Implicit Model Fusion Paper • 2412.03187 • Published Dec 4, 2024 • 12