Collections
Collections including paper arxiv:2508.03012
- lusxvr/nanoVLM-222M
  Image-Text-to-Text • 0.2B • Updated • 272 • 95
- Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
  Paper • 2503.09516 • Published • 35
- AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
  Paper • 2505.24863 • Published • 98
- QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
  Paper • 2505.17667 • Published • 89

- Agent Lightning: Train ANY AI Agents with Reinforcement Learning
  Paper • 2508.03680 • Published • 67
- Training Long-Context, Multi-Turn Software Engineering Agents with Reinforcement Learning
  Paper • 2508.03501 • Published • 55
- SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience
  Paper • 2508.04700 • Published • 51
- RoboMemory: A Brain-inspired Multi-memory Agentic Framework for Lifelong Learning in Physical Embodied Systems
  Paper • 2508.01415 • Published • 7

- Rank1: Test-Time Compute for Reranking in Information Retrieval
  Paper • 2502.18418 • Published • 28
- Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
  Paper • 2503.09516 • Published • 35
- Fixing Data That Hurts Performance: Cascading LLMs to Relabel Hard Negatives for Robust Information Retrieval
  Paper • 2505.16967 • Published • 24
- SitEmb-v1.5: Improved Context-Aware Dense Retrieval for Semantic Association and Long Story Comprehension
  Paper • 2508.01959 • Published • 57