- Microsoft Research
- Beijing, China
- https://pengzhiliang.github.io
Stars
Open-source orchestration for zero-human companies
Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding
omo; the best agent harness - previously oh-my-opencode
A curated list of awesome Claude Skills, resources, and tools for customizing Claude AI workflows
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
A ComfyUI custom node integration for local multi-engine multi-language Text-to-Speech and Voice Conversion. Supports: RVC, Echo-TTS, Qwen3-TTS, Cozy Voice 3, Step Audio EditX, IndexTTS-2, Chatterb…
A comprehensive ComfyUI integration for Microsoft's VibeVoice text-to-speech model, enabling high-quality single and multi-speaker voice synthesis directly within your ComfyUI workflows.
ComfyUI custom node for the VibeVoice TTS. Expressive, long-form, multi-speaker conversational audio
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
FULL Augment Code, Claude Code, Cluely, CodeBuddy, Comet, Cursor, Devin AI, Junie, Kiro, Leap.new, Lovable, Manus, NotionAI, Orchids.app, Perplexity, Poke, Qoder, Replit, Same.dev, Trae, Traycer AI…
Production-ready platform for agentic workflow development.
[NeurIPS 2025] YOLOv12: Attention-Centric Real-Time Object Detectors
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted fo…
MoVQGAN - model for the image encoding and reconstruction
Vector (and Scalar) Quantization, in Pytorch
🔊 Text-Prompted Generative Audio Model
Unofficial implementation of High Fidelity Neural Audio Compression
A repository for research on medium sized language models.
A Python package for interactive mapping and geospatial analysis with minimal coding in a Jupyter environment
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
ImageBind: One Embedding Space to Bind Them All
Fast and memory-efficient exact attention