| Scaling World Model | Scaling World Model for Hierarchical Manipulation Policies | Manipulation |
| Say, Dream, and Act | Say, Dream, and Act: Learning Video World Models for Instruction-Driven Robot Manipulation | Manipulation |
| World-VLA-Loop | World-VLA-Loop: Closed-Loop Learning of Video World Model and VLA Policy | Manipulation |
| RISE | RISE: Self-Improving Robot Policy with Compositional World Model | Manipulation |
| VLAW | VLAW: Iterative Co-Improvement of Vision-Language-Action Policy and World Model | Manipulation |
| GigaBrain-0.5M* | GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning | VLA |
| JEPA-VLA | JEPA-VLA: Video Predictive Embedding is Needed for VLA Models | VLA |
| VLA-JEPA | VLA-JEPA: Enhancing Vision-Language-Action Model with Latent World Model | VLA |
| DreamDojo | DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos | Manipulation |
| DDP-WM | DDP-WM: Disentangled Dynamics Prediction for Efficient World Models | Manipulation |
| LingBot-VA | LingBot-VA: Causal World Modeling for Robot Control | Manipulation |
| LingBot-VLA | LingBot-VLA: A Pragmatic VLA Foundation Model | Manipulation |
| PointWorld | PointWorld: Scaling 3D World Models for In-The-Wild Robotic Manipulation | Manipulation |
| Envision | Envision: Embodied Visual Planning via Goal-Imagery Video Diffusion | Manipulation |
| Large Video Planner | Large Video Planner Enables Generalizable Robot Control | Manipulation |
| Act2Goal | Act2Goal: From World Model To General Goal-conditioned Policy | Manipulation |
| RoboReward | RoboReward: General-Purpose Vision-Language Reward Models for Robotics | Manipulation |
| π0 | π0: A Vision-Language-Action Flow Model for General Robot Control | Foundation |
| Octo | Octo: An Open-Source Generalist Robot Policy | Foundation |
| OpenVLA | OpenVLA: An Open-Source Vision-Language-Action Model | Foundation |
| RDT-1B | RDT-1B: A Diffusion Foundation Model for Bimanual Manipulation | Manipulation |
| TesserAct | TesserAct: Learning 4D Embodied World Models | Foundation |
| DreamGen | DreamGen: Unlocking Generalization in Robot Learning through Video World Models | Foundation |
| iVideoGPT | iVideoGPT: Interactive VideoGPTs are Scalable World Models | Foundation |
| AgiBot-World | AgiBot World Colosseo: A Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems | Manipulation |
| FLARE | FLARE: Robot Learning with Implicit World Modeling | Manipulation |
| EnerVerse | EnerVerse: Envisioning Embodied Future Space for Robotics Manipulation | Manipulation |
| NWM | Navigation World Models | Navigation |
| MindJourney | MindJourney: Test-Time Scaling with World Models for Spatial Reasoning | Navigation |
| DWL | Advancing Humanoid Locomotion: Mastering Challenging Terrains with Denoising World Model Learning | Locomotion |
| Puppeteer | Hierarchical World Models as Visual Whole-Body Humanoid Controllers | Locomotion |