WEAVE: Unleashing and Benchmarking the In-context Interleaved Comprehension and Generation Paper • 2511.11434 • Published Nov 14 • 44
🔱 Sailor2 Language Models Collection Sailing in South-East Asia with Inclusive Multilingual LLMs • 34 items • Updated Nov 19 • 30
VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation Paper • 2511.02778 • Published Nov 4 • 101
UniLumos: Fast and Unified Image and Video Relighting with Physics-Plausible Feedback Paper • 2511.01678 • Published Nov 3 • 35
From Charts to Code: A Hierarchical Benchmark for Multimodal Models Paper • 2510.17932 • Published Oct 20 • 7
Paper2Video: Automatic Video Generation from Scientific Papers Paper • 2510.05096 • Published Oct 6 • 118
V-MAGE: A Game Evaluation Framework for Assessing Visual-Centric Capabilities in Multimodal Large Language Models Paper • 2504.06148 • Published Apr 8 • 13
Beyond Words: Advancing Long-Text Image Generation via Multimodal Autoregressive Models Paper • 2503.20198 • Published Mar 26 • 4