Defining and Evaluating Visual Language Models' Basic Spatial Abilities: A Perspective from Psychometrics Paper • 2502.11859 • Published Feb 17
Unfolding Spatial Cognition: Evaluating Multimodal Models on Visual Simulations Paper • 2506.04633 • Published Jun 5 • 19
PulseCheck457: A Diagnostic Benchmark for 6D Spatial Reasoning of Large Multimodal Models Paper • 2502.08636 • Published Feb 12
Bridging Perspectives: A Survey on Cross-view Collaborative Intelligence with Egocentric-Exocentric Vision Paper • 2506.06253 • Published Jun 6 • 9
Vision-Language-Action Models: Concepts, Progress, Applications and Challenges Paper • 2505.04769 • Published May 7 • 9
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models Paper • 2501.09686 • Published Jan 16 • 41
Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers Paper • 2506.23918 • Published Jun 30 • 86
AMFT: Aligning LLM Reasoners by Meta-Learning the Optimal Imitation-Exploration Balance Paper • 2508.06944 • Published about 1 month ago • 2