luokai

iamluokai

AI & ML interests

None yet

Recent Activity

liked a Space about 11 hours ago

Wan-AI/Wan2.2-Animate

liked a model 5 days ago

BadToBest/EchoMimicV3

liked a Space 8 days ago

IndexTeam/IndexTTS-2-Demo

upvoted a collection 18 days ago

MobileCLIP2

MobileCLIP2: Mobile-friendly image-text models with SOTA zero-shot capabilities trained on DFNDR-2B • 37 items • Updated 1 day ago • 53

upvoted a collection 21 days ago

FastVLM

Efficient Vision Encoding for Vision Language Models • 9 items • Updated 17 days ago • 99

upvoted 2 papers about 1 month ago

ToonComposer: Streamlining Cartoon Production with Generative Post-Keyframing

Paper • 2508.10881 • Published Aug 14 • 52

Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens

Paper • 2508.01191 • Published Aug 2 • 235

upvoted a paper about 2 months ago

EarthCrafter: Scalable 3D Earth Generation via Dual-Sparse Latent Diffusion

Paper • 2507.16535 • Published Jul 22 • 20

upvoted a collection 2 months ago

Seed-X

A powerful open-source multilingual translation language model series, including instruction and reasoning models. • 8 items • Updated 29 days ago • 65

upvoted a paper 2 months ago

RoboBrain 2.0 Technical Report

Paper • 2507.02029 • Published Jul 2 • 31

upvoted a paper 3 months ago

XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation

Paper • 2506.21416 • Published Jun 26 • 28

upvoted a collection 3 months ago

ERNIE 4.5

Collection of ERNIE 4.5 models. "-Paddle" models use PaddlePaddle weights, while "-PT" models use Transformer-style PyTorch weights. • 26 items • Updated 11 days ago • 172

upvoted an article 3 months ago

🤔👀🎬🖥️📖 Kimi-VL-A3B-Thinking-2506: A Quick Navigation

Article • Published Jun 21 • 68

upvoted a collection 3 months ago

MedGemma Release

Collection of Gemma 3 variants for performance on medical text and image comprehension to accelerate building healthcare-based AI applications. • 7 items • Updated Jul 11 • 310

upvoted a collection 4 months ago

Qwen2.5-Omni

End-to-End Omni (text, audio, image, video, and natural speech interaction) model based on Qwen2.5 • 7 items • Updated Jul 21 • 158

upvoted 2 collections 5 months ago

Qwen3

84 items • Updated Aug 6 • 1.26k

InternVL3

34 items • Updated Apr 20 • 81

upvoted 2 papers 6 months ago

SkyReels-A2: Compose Anything in Video Diffusion Transformers

Paper • 2504.02436 • Published Apr 3 • 38

ReCamMaster: Camera-Controlled Generative Rendering from A Single Video

Paper • 2503.11647 • Published Mar 14 • 143

upvoted a collection 6 months ago

Wan2.1 14B 480p I2V LoRAs

A collection of Remade's Wan2.1 14B 480p I2V LoRAs • 49 items • Updated May 24 • 202

upvoted 2 collections 7 months ago

olmOCR

olmOCR is a document recognition pipeline for efficiently converting documents into plain text. olmocr.allenai.org • 6 items • Updated Jul 23 • 130

DeepSeek R1 (All Versions)

DeepSeek-R1-0528 is here! The most powerful reasoning open LLM, available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 37 items • Updated 30 days ago • 258

upvoted a paper 7 months ago

Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model

Paper • 2502.10248 • Published Feb 14 • 55