Collections including paper arxiv:2508.15144

mPLUG/GUI-Owl-7B

8B • Updated 18 days ago • 773 • 35
mPLUG/GUI-Owl-32B

33B • Updated 18 days ago • 331 • 19
Mobile-Agent-v3: Foundamental Agents for GUI Automation

Paper • 2508.15144 • Published 19 days ago • 61

AI Paper of the Day

A collection of papers that I think are interesting, one added each day

Can Large Language Models Understand Context?

Paper • 2402.00858 • Published Feb 1, 2024 • 24
OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1, 2024 • 84
Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 152
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity

Paper • 2401.17072 • Published Jan 30, 2024 • 24

Mobile-Agent-v3: Foundamental Agents for GUI Automation

Paper • 2508.15144 • Published 19 days ago • 61

UI-Venus Technical Report: Building High-performance UI Agents with RFT

Paper • 2508.10833 • Published 25 days ago • 41
Mobile-Agent-v3: Foundamental Agents for GUI Automation

Paper • 2508.15144 • Published 19 days ago • 61

SuperWriter: Reflection-Driven Long-Form Generation with Large Language Models

Paper • 2506.04180 • Published Jun 4 • 33
AniMaker: Automated Multi-Agent Animated Storytelling with MCTS-Driven Clip Generation

Paper • 2506.10540 • Published Jun 12 • 38
AutoMind: Adaptive Knowledgeable Agent for Automated Data Science

Paper • 2506.10974 • Published Jun 12 • 19
SPAR: Scholar Paper Retrieval with LLM-based Agents for Enhanced Academic Search

Paper • 2507.15245 • Published Jul 21 • 11

A collection of algorithmic agents for user interfaces and interactions, program synthesis, and robotics

End-to-End Goal-Driven Web Navigation

Paper • 1602.02261 • Published Feb 6, 2016
Learning Language Games through Interaction

Paper • 1606.02447 • Published Jun 8, 2016
Naturalizing a Programming Language via Interactive Learning

Paper • 1704.06956 • Published Apr 23, 2017
Reinforcement Learning on Web Interfaces Using Workflow-Guided Exploration

Paper • 1802.08802 • Published Feb 24, 2018 • 1

Packing Input Frame Context in Next-Frame Prediction Models for Video Generation

Paper • 2504.12626 • Published Apr 17 • 52
Qwen3 Technical Report

Paper • 2505.09388 • Published May 14 • 287
Qwen-Image Technical Report

Paper • 2508.02324 • Published Aug 4 • 242
DINOv3

Paper • 2508.10104 • Published 26 days ago • 241

LiveMCP-101: Stress Testing and Diagnosing MCP-enabled Agents on Challenging Queries

Paper • 2508.15760 • Published 18 days ago • 44
Mobile-Agent-v3: Foundamental Agents for GUI Automation

Paper • 2508.15144 • Published 19 days ago • 61
rStar2-Agent: Agentic Reasoning Technical Report

Paper • 2508.20722 • Published 11 days ago • 98

GUI-G^2: Gaussian Reward Modeling for GUI Grounding

Paper • 2507.15846 • Published Jul 21 • 131
WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent

Paper • 2508.05748 • Published Aug 7 • 122
Mobile-Agent-v3: Foundamental Agents for GUI Automation

Paper • 2508.15144 • Published 19 days ago • 61
AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs

Paper • 2508.16153 • Published 18 days ago • 135

Multimodal Agent

Gemini Robotics: Bringing AI into the Physical World

Paper • 2503.20020 • Published Mar 25 • 28
Magma: A Foundation Model for Multimodal AI Agents

Paper • 2502.13130 • Published Feb 18 • 58
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents

Paper • 2311.05437 • Published Nov 9, 2023 • 51
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents

Paper • 2410.23218 • Published Oct 30, 2024 • 51
