Submitted by learn3r 114 WebSailor: Navigating Super-human Reasoning for Web Agent · 19 authors 6.5k 4
Submitted by Warrieryes 86 Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers · 15 authors 913 3
Submitted by amanchadha 65 Energy-Based Transformers are Scalable Learners and Thinkers · 10 authors 475 15
Submitted by Liuff23 60 LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion · 7 authors 278 1
Submitted by chrisliu298 53 Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy · 12 authors 106 7
Submitted by ai-alanov 39 Heeding the Inner Voice: Aligning ControlNet Training via Intermediate Features Feedback · 4 authors 28 1
Submitted by siqisun 35 IntFold: A Controllable Foundation Model for General and Specialized Biomolecular Structure Prediction · 7 authors 140 5
Submitted by jinjiajie 25 Decoupled Planning and Execution: A Hierarchical Reasoning Framework for Deep Search · 8 authors 2
Submitted by yilunzhao 18 Can LLMs Identify Critical Limitations within Scientific Research? A Systematic Evaluation on AI Research Papers · 5 authors 1
Submitted by hba123 14 Bourbaki: Self-Generated and Goal-Conditioned MDPs for Theorem Proving · 6 authors 1
Submitted by SivilTaram 10 ZeCO: Zero Communication Overhead Sequence Parallelism for Linear Attention · 9 authors 1
Submitted by kenhktsui 9 Self-Correction Bench: Revealing and Addressing the Self-Correction Blind Spot in LLMs · 1 author 2 3
Submitted by Facico 7 Selecting and Merging: Towards Adaptable and Scalable Named Entity Recognition with Large Language Models · 3 authors 4 1
Submitted by JJ-TMT 5 AsyncFlow: An Asynchronous Streaming RL Framework for Efficient LLM Post-Training · 19 authors 1
Submitted by yxl66666 2 CRISP-SAM2: SAM2 with Cross-Modal Interaction and Semantic Prompting for Multi-Organ Segmentation · 8 authors 21 1
Submitted by SivanSX 2 HalluSegBench: Counterfactual Visual Reasoning for Segmentation Hallucination Evaluation · 6 authors 1