Stars
HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds
Information collection for the Happy Horse AI video generator model. Official demo and updates at happyhorses.io.
Unified Codebase for Advanced World Models.
CLIP+MLP Aesthetic Score Predictor
Steering Forward-Process Reinforcement Learning for Distilled Autoregressive Video Models
Helios: Real-Time Long Video Generation Model
DreamWorld: Unified World Modeling in Video Generation
A Curated List of Awesome Video World Models with AR Diffusion: Covering Algorithms, Applications, and Infrastructure, Aimed at Serving as a Comprehensive Resource for Researchers, Practitioners, a…
Video Content Customization Using First Frame
FireRed-Image-Edit is a powerful image editing foundation model achieving open-source state-of-the-art performance with precise instruction following, high-fidelity generation, superior identity co…
Official codebase for "Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactive Video Generation"
[NeurIPS 2025 D&B🔥] OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation
[CVPR 2026 Highlight] VideoCoF: Unified Video Editing with Temporal Reasoner
[ICLR 2026] EditScore: Unlocking Online RL for Image Editing via High-Fidelity Reward Modeling
📖 This is a repository for organizing papers, codes, and other resources related to unified multimodal models.
[NeurIPS 2025] Sekai: A Video Dataset towards World Exploration
[ICLR 2026 Oral] Stable Video Infinity: Infinite-Length Video Generation with Error Recycling
(CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
Implementation of "Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length"
[NeurIPS 2025] Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance
"MoCA: Mixture-of-Components Attention for Scalable Compositional 3D Generation"
A curated list of recent diffusion models for video generation, editing, and various other applications.
[ICLR 2026] LongLive: Real-time Interactive Long Video Generation
The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that sho…
Kandinsky 5.0: A family of diffusion models for Video & Image generation
HunyuanVideo-1.5: A leading lightweight video generation model
PyTorch implementation of JiT https://arxiv.org/abs/2511.13720