Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2505.21497

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 29
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 13
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 45
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 24

Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers

Paper • 2505.21497 • Published May 27 • 108

Ready_to_try_at_K22

Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers

Paper • 2505.21497 • Published May 27 • 108

Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers

Paper • 2505.21497 • Published May 27 • 108

InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU

Paper • 2502.08910 • Published Feb 13 • 149
From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation up to 100K Tokens

Paper • 2502.18890 • Published Feb 26 • 30
MPO: Boosting LLM Agents with Meta Plan Optimization

Paper • 2503.02682 • Published Mar 4 • 27
SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents

Paper • 2505.20411 • Published May 26 • 88

AI Paper of the Day

A collection of papers that I think are interesting, one added each day

about 11 hours ago

Can Large Language Models Understand Context?

Paper • 2402.00858 • Published Feb 1, 2024 • 24
OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1, 2024 • 84
Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 152
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity

Paper • 2401.17072 • Published Jan 30, 2024 • 24

Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers

Paper • 2505.21497 • Published May 27 • 108

DeepResearchGym: A Free, Transparent, and Reproducible Evaluation Sandbox for Deep Research

Paper • 2505.19253 • Published May 25 • 29
SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents

Paper • 2505.20411 • Published May 26 • 88
Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers

Paper • 2505.21497 • Published May 27 • 108
Agentic Reinforced Policy Optimization

Paper • 2507.19849 • Published Jul 26 • 148

Multimodal Agent

Gemini Robotics: Bringing AI into the Physical World

Paper • 2503.20020 • Published Mar 25 • 28
Magma: A Foundation Model for Multimodal AI Agents

Paper • 2502.13130 • Published Feb 18 • 58
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents

Paper • 2311.05437 • Published Nov 9, 2023 • 51
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents

Paper • 2410.23218 • Published Oct 30, 2024 • 51

MLLM-as-a-Judge for Image Safety without Human Labeling

Paper • 2501.00192 • Published Dec 31, 2024 • 32
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Paper • 2501.00958 • Published Jan 1 • 107
Xmodel-2 Technical Report

Paper • 2412.19638 • Published Dec 27, 2024 • 27
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs

Paper • 2412.18925 • Published Dec 25, 2024 • 105

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 29
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 13
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 45
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 24

AI Paper of the Day

A collection of papers that I think are interesting, one added each day

about 11 hours ago

Can Large Language Models Understand Context?

Paper • 2402.00858 • Published Feb 1, 2024 • 24
OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1, 2024 • 84
Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 152
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity

Paper • 2401.17072 • Published Jan 30, 2024 • 24

Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers

Paper • 2505.21497 • Published May 27 • 108

Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers

Paper • 2505.21497 • Published May 27 • 108

Ready_to_try_at_K22

Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers

Paper • 2505.21497 • Published May 27 • 108

DeepResearchGym: A Free, Transparent, and Reproducible Evaluation Sandbox for Deep Research

Paper • 2505.19253 • Published May 25 • 29
SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents

Paper • 2505.20411 • Published May 26 • 88
Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers

Paper • 2505.21497 • Published May 27 • 108
Agentic Reinforced Policy Optimization

Paper • 2507.19849 • Published Jul 26 • 148

Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers

Paper • 2505.21497 • Published May 27 • 108

Multimodal Agent

Gemini Robotics: Bringing AI into the Physical World

Paper • 2503.20020 • Published Mar 25 • 28
Magma: A Foundation Model for Multimodal AI Agents

Paper • 2502.13130 • Published Feb 18 • 58
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents

Paper • 2311.05437 • Published Nov 9, 2023 • 51
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents

Paper • 2410.23218 • Published Oct 30, 2024 • 51

InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU

Paper • 2502.08910 • Published Feb 13 • 149
From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation up to 100K Tokens

Paper • 2502.18890 • Published Feb 26 • 30
MPO: Boosting LLM Agents with Meta Plan Optimization

Paper • 2503.02682 • Published Mar 4 • 27
SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents

Paper • 2505.20411 • Published May 26 • 88

MLLM-as-a-Judge for Image Safety without Human Labeling

Paper • 2501.00192 • Published Dec 31, 2024 • 32
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Paper • 2501.00958 • Published Jan 1 • 107
Xmodel-2 Technical Report

Paper • 2412.19638 • Published Dec 27, 2024 • 27
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs

Paper • 2412.18925 • Published Dec 25, 2024 • 105

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets OCR模型免费转Markdown Pricing 模型下载攻略