| Omni-Video 2 | Omni-Video 2: Scaling MLLM-Conditioned Diffusion for Unified Video Generation and Editing | arXiv | 2026 |
| Factorized VidGen | Factorized Video Generation: Decoupling Scene Construction and Temporal Synthesis | arXiv | 2025 |
| Wan 2.1 | Wan: Open and Advanced Large-Scale Video Generative Models | Alibaba | 2025 |
| HunyuanVideo | HunyuanVideo: A Systematic Framework for Large Video Generative Models | Tencent | 2025 |
| Step-Video-T2V | Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model | StepFun | 2025 |
| CogVideoX-5B | CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer | Zhipu AI | 2025 |
| Veo 2 | Veo 2: Photorealistic Video Generation | Google DeepMind | 2025 |
| Causal Forcing | Autoregressive Diffusion Distillation for Real-Time Interactive Video Generation | arXiv | 2026 |
| MAGI-1 | Autoregressive Video Generation at Scale | Sand AI | 2025 |
| Seaweed-7B | Cost-Effective Training of Video Generation Foundation Model | ByteDance | 2025 |
| Magic 1-For-1 | Generating One Minute Video Clips within One Minute | arXiv | 2025 |
| Lumina-Video | Efficient and Flexible Video Generation with Multi-scale Next-DiT | arXiv | 2025 |
| RepVideo | Rethinking Cross-Layer Representation for Video Generation | arXiv | 2025 |
| M4V | Multi-Modal Mamba for Text-to-Video Generation | arXiv | 2025 |
| RIFLEx | A Free Lunch for Length Extrapolation in Video Diffusion Transformers | arXiv | 2025 |
| Movie Gen | A Cast of Media Foundation Models | Meta | 2024 |
| Sora | Video Generation Models as World Simulators | OpenAI | 2024 |
| Vidu | Highly Consistent Text-to-Video Generator with Diffusion Models | Shengshu | 2024 |
| Snap Video | Scaled Spatiotemporal Transformers for Text-to-Video Synthesis | Snap Inc | 2024 |
| Latte | Latent Diffusion Transformer for Video Generation | arXiv | 2024 |
| GenTron | Delving Deep into Diffusion Transformers for Image and Video Generation | CVPR | 2024 |
| Lumiere | A Space-Time Diffusion Model for Video Generation | Google | 2024 |
| MagicVideo-V2 | Multi-Stage High-Aesthetic Video Generation | ByteDance | 2024 |
| VideoPoet | A Large Language Model for Zero-Shot Video Generation | Google | 2023 |
| Photorealistic Video Generation | Photorealistic Video Generation with Diffusion Models | Google | 2023 |
| EasyAnimate | EasyAnimate: An End-to-End Solution for High-Resolution and Long Video Generation | Alibaba | 2024 |
| VideoCrafter2 | VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models | CVPR | 2024 |
| AnimateLCM | AnimateLCM: Accelerating the Animation of Personalized Diffusion Models with Decoupled Consistency Learning | arXiv | 2024 |
| Open-Sora 2.0 | Open-Sora 2.0: Training a Commercial-Level Video Generation Model in $200k | HPC-AI Tech | 2025 |
| StreamDiT | StreamDiT: Real-Time Streaming Text-to-Video Generation | arXiv | 2025 |
| Seedance 1.0 | Seedance 1.0: Exploring the Boundaries of Video Generation Models | ByteDance | 2025 |
| GameGen-X | GameGen-X: Interactive Open-world Game Video Generation | ICLR | 2025 |