KABI

dongguanting

https://dongguanting.github.io/

AI & ML interests

Reasoning and Alignment for Large Language Models

Recent Activity

new activity about 17 hours ago

dongguanting/Tool-Star-Qwen-1.5B:Update README.md

upvoted a paper about 18 hours ago

AgentEvolver: Towards Efficient Self-Evolving Agent System

upvoted a paper about 21 hours ago

MemSifter: Offloading LLM Memory Retrieval via Outcome-Driven Proxy Reasoning

Collections 4

Papers 43

arxiv:2602.22897

arxiv:2601.06860

arxiv:2601.05808

arxiv:2601.04888

Models 16

dongguanting/Tool-Star-Qwen-1.5B

Text Generation • 2B • Updated about 17 hours ago • 8 • 2

dongguanting/Qwen3-8B-AEPO-DeepSearch

Text Generation • 8B • Updated Dec 20, 2025 • 2 • 2

dongguanting/QwQ-32B-AEPO-DeepSearch

Text Generation • 33B • Updated Dec 20, 2025 • 2 • 1

dongguanting/QwQ-32B-ARPO-DeepSearch

33B • Updated Dec 20, 2025 • 1

dongguanting/aepo_light

8B • Updated Nov 3, 2025 • 1

dongguanting/Qwen2.5-7B-AEPO

Text Generation • 8B • Updated Oct 27, 2025 • 3 • 1

dongguanting/Qwen3-14B-AEPO-DeepSearch

Robotics • 15B • Updated Oct 21, 2025 • 4 • 1

dongguanting/Qwen2.5-7B-ARPO

Text Generation • 8B • Updated Aug 19, 2025 • 2 • 2

dongguanting/Llama3.1-8B-ARPO

Text Generation • 8B • Updated Aug 12, 2025 • 2 • 1

dongguanting/Qwen2.5-3B-ARPO

Text Generation • 3B • Updated Aug 12, 2025 • 7 • 3

Datasets 11

dongguanting/ARPO-RL-DeepSearch-1K

Viewer • Updated Oct 17, 2025 • 1.07k • 219 • 6

dongguanting/ARPO-RL-Reasoning-10K

Viewer • Updated Oct 17, 2025 • 10k • 209 • 4

dongguanting/ARPO-SFT-54K

Viewer • Updated Oct 17, 2025 • 54.6k • 256 • 14

dongguanting/RAG-Error-Critic-100K

Viewer • Updated Jun 28, 2025 • 100k • 18 • 3

dongguanting/Tool-Star-SFT-54K

Viewer • Updated May 29, 2025 • 54k • 55 • 10

dongguanting/Multi-Tool-RL-10K

Viewer • Updated May 25, 2025 • 10k • 80 • 5

dongguanting/RAG-QA-40K

Viewer • Updated Dec 27, 2024 • 32.8k • 65 • 2

dongguanting/ShareGPT-12K

Viewer • Updated Dec 27, 2024 • 12.9k • 110 • 1

dongguanting/VIF-RAG-QA-110K

Viewer • Updated Dec 27, 2024 • 111k • 56 • 7

dongguanting/DotamathQA

Viewer • Updated Dec 26, 2024 • 574k • 82 • 2
