Submitted by henggg 156 The Landscape of Agentic Reinforcement Learning for LLMs: A Survey · 25 authors 343 2
Submitted by lovesnowbest 105 UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning · 106 authors 4
Submitted by SivilTaram 76 SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning · 7 authors 250 2
Submitted by taesiri 74 LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model · 7 authors 4.19k 1
Submitted by DongfuJiang 60 VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use · 12 authors 468 4
Submitted by HLSv 53 ELV-Halluc: Benchmarking Semantic Aggregation Hallucinations in Long Video Understanding · 8 authors 7 1
Submitted by YuanLiuuuuuu 42 POINTS-Reader: Distillation-Free Adaptation of Vision-Language Models for Document Conversion · 11 authors 3
Submitted by rishiraj 37 Gated Associative Memory: A Parallel O(N) Architecture for Efficient Sequence Modeling · 1 author 12 4
Submitted by hammh0a 35 Reasoning Vectors: Transferring Chain-of-Thought Capabilities via Task Arithmetic · 3 authors 1
Submitted by fairyang 33 Baichuan-M2: Scaling Medical Capability with Large Verifier System · 34 authors 2
Submitted by Yanqing0327 26 OpenVision 2: A Family of Generative Pretrained Visual Encoders for Multimodal Learning · 7 authors 355 2
Submitted by dogtooth 23 Jointly Reinforcing Diversity and Quality in Language Model Generations · 8 authors 1
Submitted by Geaming 23 Implicit Actor Critic Coupling via a Supervised Learning Framework for RLVR · 8 authors 21 2
Submitted by Xiaoyu521 22 GenCompositor: Generative Video Compositing with Diffusion Transformer · 7 authors 89 4
Submitted by Andron00e 22 Benchmarking Optimizers for Large Language Model Pretraining · 3 authors 15 1
Submitted by nsjain 18 DynaGuard: A Dynamic Guardrail Model With User-Defined Policies · 10 authors 8 2
Submitted by ahnpersie 18 FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games · 7 authors 9 1
Submitted by orionweller 16 On the Theoretical Limitations of Embedding-Based Retrieval · 4 authors 1
Submitted by kwangju 13 Attributes as Textual Genes: Leveraging LLMs as Genetic Algorithm Simulators for Conditional Synthetic Data Generation · 3 authors 1
Submitted by che111 11 M3Ret: Unleashing Zero-shot Multimodal Medical Image Retrieval via Self-Supervision · 8 authors 1
Submitted by yulongchen 10 The Gold Medals in an Empty Room: Diagnosing Metalinguistic Reasoning in LLMs with Camlang · 6 authors 1
Submitted by fengerhu 6 MobiAgent: A Systematic Framework for Customizable Mobile Agents · 10 authors 57 1
Submitted by zhangganlin 5 ViSTA-SLAM: Visual SLAM with Symmetric Two-view Association · 4 authors 80 1
Submitted by quandao10 4 Discrete Noise Inversion for Next-scale Autoregressive Text-based Image Editing · 9 authors 1
Submitted by amanchadha 4 SQL-of-Thought: Multi-agentic Text-to-SQL with Guided Error Correction · 3 authors 1
Submitted by xianbao 4 Metis: Training Large Language Models with Advanced Low-Bit Quantization · 16 authors 1
Submitted by evanking 3 Flavors of Moonshine: Tiny Specialized ASR Models for Edge Devices · 5 authors 2.86k 1
Submitted by amanchadha 3 AMBEDKAR-A Multi-level Bias Elimination through a Decoding Approach with Knowledge Augmentation for Robust Constitutional Alignment of Language Models · 8 authors 1
Submitted by kenantang 3 Flaw or Artifact? Rethinking Prompt Sensitivity in Evaluating LLMs · 6 authors 1
Submitted by taesiri 2 MedDINOv3: How to adapt vision foundation models for medical image segmentation? · 5 authors 34 1
Submitted by taesiri 2 Improving Large Vision and Language Models by Learning from a Panel of Peers · 5 authors 1
Submitted by theresiavr 2 Stairway to Fairness: Connecting Group and Individual Fairness · 5 authors 1 1
Submitted by zhengchong 2 FastFit: Accelerating Multi-Reference Virtual Try-On via Cacheable Diffusion Models · 10 authors 32 1
Submitted by aHapBean 1 Towards More Diverse and Challenging Pre-training for Point Cloud Learning: Self-Supervised Cross Reconstruction with Decoupled Views · 3 authors 7 2
Submitted by Bekhouche 1 C-DiffDet+: Fusing Global Scene Context with Generative Denoising for High-Fidelity Object Detection · 6 authors 1