| Pixel-to-4D | Pixel-to-4D: Camera-Controlled Image-to-Video Generation with Dynamic 3D Gaussians | arXiv | 2026 |
| Veo 2 | Veo 2: State-of-the-Art Video Generation with Google DeepMind | Google DeepMind | 2025 |
| Kling 1.6 | Kling 1.6: Advanced AI Video Generation Model | Kuaishou | 2025 |
| Pika 2.0 | Pika 2.0: Next-Generation AI Video Generator | Pika Labs | 2025 |
| Runway Gen-3 Alpha | Gen-3 Alpha: A New Frontier for Video Generation Models | Runway | 2024 |
| Luma Dream Machine | Dream Machine: AI Model That Makes High Quality Videos from Text and Images | Luma AI | 2024 |
| Jimeng | Jimeng: Image-to-Video Generation with Diffusion Transformers | ByteDance | 2025 |
| Stable Video Diffusion | Scaling Latent Video Diffusion Models to Large Datasets | Stability AI | 2023 |
| DynamiCrafter | Animating Open-domain Images with Video Diffusion Priors | ECCV | 2024 |
| I2VGen-XL | High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models | Alibaba | 2023 |
| PIA | Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models | CVPR | 2024 |
| AnimateDiff | Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning | ICLR | 2024 |
| ConsistI2V | Enhancing Visual Consistency for Image-to-Video Generation | arXiv | 2024 |
| TI2V-Zero | Zero-Shot Image Conditioning for Text-to-Video Diffusion Models | CVPR | 2024 |
| MagicTime | Time-lapse Video Generation Models as Metamorphic Simulators | arXiv | 2024 |
| TRIP | Temporal Residual Learning with Image Noise Prior for I2V Diffusion Models | CVPR | 2024 |
| StoryDiffusion | Consistent Self-Attention for Long-Range Image and Video Generation | NeurIPS | 2024 |
| Video-LaVIT | Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization | ICML | 2024 |
| Cinemo | Consistent and Controllable Image Animation with Motion Diffusion Models | arXiv | 2024 |
| I2V-Adapter | A General Image-to-Video Adapter for Video Diffusion Models | arXiv | 2023 |
| MotiF | Making Text Count in Image Animation with Motion Focal Loss | arXiv | 2024 |
| DLFR-VAE | Dynamic Latent Frame Rate VAE for Video Generation | arXiv | 2025 |
| Packing Input Frame Context | Next-Frame Prediction Models for Video Generation | arXiv | 2025 |
| Step-Video-TI2V | A State-of-the-Art Text-Driven Image-to-Video Generation Model | arXiv | 2025 |
| SparseCtrl | Adding Sparse Controls to Text-to-Video Diffusion Models | ECCV | 2024 |
| LivePhoto | Real Image Animation with Text-Guided Motion Control | arXiv | 2024 |
| ToonCrafter | Generative Cartoon Interpolation | SIGGRAPH Asia | 2024 |
| Follow-Your-Click | Open-domain Regional Image Animation via Short Prompts | arXiv | 2024 |
| FrameBridge | FrameBridge: Improving Image-to-Video Generation with Bridge Models | ICLR | 2025 |
| DFoT | History-Guided Video Diffusion: Diffusion Forcing Transformer for Variable-Length Conditioning | arXiv | 2025 |
| CogVideoX-I2V | Text-to-Video Diffusion Models with An Expert Transformer | ICLR | 2025 |
| Wan-I2V | Wan: Open and Advanced Large-Scale Image-to-Video Generative Models | Alibaba | 2025 |
| HunyuanVideo-I2V | Image-to-Video Generation with a Systematic Framework | Tencent | 2025 |
| EasyAnimate-I2V | EasyAnimate: An End-to-End Solution for Image-to-Video Generation | Alibaba | 2024 |
| ALIVE | ALIVE: Animate Your World with Lifelike Audio-Video Generation | arXiv | 2026 |