World Models
The acquisition of structured, predictive representations of environment dynamics — the foundational substrate for planning, simulation, and embodied intelligence.
World models aim to learn internal representations of how the world works, enabling agents to predict future states, plan actions, and reason about counterfactuals. This pillar now integrates content from the Awesome 3D and 4D World Models survey, covering video, occupancy, and LiDAR generation paradigms. Explore the six sub-domains below.
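To give a concrete intuition for "predict future states" and "plan actions", here is a minimal sketch of a latent world model that rolls candidate action sequences forward purely in imagination and picks the best one. All class and method names (LatentWorldModel, imagine) are illustrative placeholders under toy assumptions, not the API of any specific method from the collections below.

```python
# Minimal sketch: encode an observation into a latent state, roll the dynamics
# forward under candidate actions, and score the imagined futures.
# Everything here is a toy placeholder, not a specific published model.
import torch
import torch.nn as nn


class LatentWorldModel(nn.Module):
    """Toy latent dynamics model: z_{t+1} = f(z_t, a_t)."""

    def __init__(self, obs_dim: int, act_dim: int, latent_dim: int = 32):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, latent_dim)                 # o_t -> z_t
        self.dynamics = nn.Linear(latent_dim + act_dim, latent_dim)   # (z_t, a_t) -> z_{t+1}
        self.reward_head = nn.Linear(latent_dim, 1)                   # z_t -> predicted reward

    def imagine(self, obs: torch.Tensor, actions: torch.Tensor) -> torch.Tensor:
        """Roll out an action sequence entirely in latent space and return the
        cumulative predicted reward (no environment interaction needed)."""
        z = torch.tanh(self.encoder(obs))
        total_reward = torch.zeros(obs.shape[0], 1)
        for t in range(actions.shape[1]):
            z = torch.tanh(self.dynamics(torch.cat([z, actions[:, t]], dim=-1)))
            total_reward = total_reward + self.reward_head(z)
        return total_reward


# Planning by imagination: evaluate random action sequences and pick the best.
model = LatentWorldModel(obs_dim=8, act_dim=2)
obs = torch.randn(16, 8)                   # 16 candidate plans from the same state
candidate_actions = torch.randn(16, 5, 2)  # 5-step action sequences
best_plan = candidate_actions[model.imagine(obs, candidate_actions).argmax()]
```

The same encode-predict-score loop underlies most of the approaches catalogued in the sub-domains below, with the toy linear layers replaced by video, occupancy, or point-cloud generators.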
Simulation & Driving
Foundation world models, interactive video generation, game simulation engines, and autonomous driving world models for synthetic data and planning.
Embodied Intelligence
Embodied AI for manipulation, navigation, and locomotion, plus vision-language-action models and model-based reinforcement learning approaches.
Video Generation
World modeling from video generation — data engines for multi-view street synthesis, action-conditioned prediction, neural simulators, and 4D scene reconstruction.
Occupancy Generation
3D/4D occupancy grids encoding geometry and semantics in voxel space — scene representation, occupancy forecasting, and autoregressive simulation.
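To make the representation concrete, the sketch below shows what a 3D/4D semantic occupancy sequence looks like in memory and how an autoregressive step consumes one frame to produce the next. The grid size, class labels, and the trivial shift-based "forecaster" are illustrative assumptions only, not a real occupancy world model.

```python
# Minimal sketch of a semantic occupancy sequence: a dense voxel grid with one
# class label per cell, rolled forward frame by frame. Names and the trivial
# "forecast" rule are illustrative placeholders, not a published method.
import numpy as np

FREE, OCCUPIED_ROAD, OCCUPIED_CAR = 0, 1, 2   # toy semantic classes

def make_frame(x_dim=50, y_dim=50, z_dim=4, car_x=10):
    """One 3D occupancy frame: geometry (which voxels are filled) plus semantics."""
    grid = np.full((x_dim, y_dim, z_dim), FREE, dtype=np.uint8)
    grid[:, :, 0] = OCCUPIED_ROAD                      # ground plane
    grid[car_x:car_x + 4, 23:27, 1:3] = OCCUPIED_CAR   # a 4x4x2-voxel "car"
    return grid

def forecast_next(frame, shift=1):
    """Placeholder autoregressive step: predict the next frame from the current
    one (here, simply translating the dynamic voxels forward along x)."""
    nxt = frame.copy()
    car = frame == OCCUPIED_CAR
    nxt[car] = FREE
    nxt[np.roll(car, shift, axis=0)] = OCCUPIED_CAR
    return nxt

# A 4D occupancy sequence (time, x, y, z): roll the "model" forward 3 steps.
frames = [make_frame()]
for _ in range(3):
    frames.append(forecast_next(frames[-1]))
sequence = np.stack(frames)                            # shape (4, 50, 50, 4)
print(sequence.shape, np.count_nonzero(sequence == OCCUPIED_CAR, axis=(1, 2, 3)))
```

Real occupancy world models replace the hand-written shift with a learned generator, but the underlying data layout (a time series of labeled voxel grids) is the same.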
LiDAR Generation
Leveraging point cloud sequences from LiDAR sensors to generate geometry-grounded scenes for safety-critical domains such as autonomous driving.
Theory, Benchmarks & Surveys
Theoretical foundations, evaluation benchmarks, driving datasets, workshops, and comprehensive survey literature for world models research.