- Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation
  Paper • 2507.02608 • Published • 21
- HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model
  Paper • 2503.10631 • Published
- Mobile Video Diffusion
  Paper • 2412.07583 • Published • 20
- ObjFiller-3D: Consistent Multi-view 3D Inpainting via Video Diffusion Models
  Paper • 2508.18271 • Published • 7
Stoney Kang
sikang99
AI & ML interests
Remote Control based on Vision
Recent Activity
upvoted a paper 2 days ago
From Editor to Dense Geometry Estimator
upvoted a paper 3 days ago
Planning with Reasoning using Vision Language World Model
upvoted a paper 3 days ago
Robix: A Unified Model for Robot Interaction, Reasoning and Planning