Stars
Official repo for paper "HiMoE-VLA: Hierarchical Mixture-of-Experts for Generalist Vision-Language-Action Policies"
[CoRL 2025] Code for FLOWER VLA, for finetuning FLOWER on CALVIN and all LIBERO environments
[CVPR 2026] FlashMotion: Few-Step Controllable Video Generation with Trajectory Guidance
VideoLoom: A Video Large Language Model for Joint Spatial-Temporal Understanding
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
A comprehensive list of excellent research papers, models, datasets, and other resources on Vision-Language-Action (VLA) models in robotics.
A curated list of state-of-the-art research in embodied AI, focusing on vision-language-action (VLA) models, vision-language navigation (VLN), and related multimodal learning approaches.
A GPU-accelerated library that enables random frame access and efficient video decoding for data loading.
We present StableAvatar, the first end-to-end video diffusion transformer, which synthesizes infinite-length high-quality audio-driven avatar videos without any post-processing, conditioned on a re…
Official implementation of Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning
4-step distilled version of Wan2.2-TI2V-5B
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
[ICCV 2025] MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance
Official implementation of UnifiedReward & [NeurIPS 2025] UnifiedReward-Think & UnifiedReward-Flex
[CSUR] A Survey on Video Diffusion Models
FNIN: A Fourier Neural Operator-based Numerical Integration Network for Surface-form-gradients
[CVPR 2025] Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation
[Embodied-AI-Survey-2025] Paper List and Resource Repository for Embodied AI
This repository implements a Retrieval-Augmented Generation (RAG) system using FAISS for vector-based retrieval and GPT for response generation. It is designed to process large datasets, index them…
CVPR and NeurIPS poster examples and templates
[CVPR 2024] Official Repository of Paper "Panacea: Panoramic and Controllable Video Generation for Autonomous Driving"
[ACM MM 2023] Little Strokes Fell Great Oaks: Boosting the Hierarchical Features for Multi-exposure Image Fusion
A Collection of Papers and Codes for CVPR2026/CVPR2025/ICCV2025/CVPR2024/ECCV2024 AIGC