Tsinghua University
Zhejiang, China
UTC +08:00
https://shzh.wiki
Lists (8)
Inference: Deep learning inference engine.
Kernel: High performance kernel libraries.
LLM: Large Language Model related resources.
ML compiler: Machine Learning Compilers related resources.
Quant: Quantization kernel libraries.
SD: Stable Diffusion related resources.
Serving: Large model inference serving framework.
Training: Deep learning training libraries.
Stars
A framework for efficient model inference with omni-modality models
Miles is an enterprise-facing reinforcement learning framework for LLM and VLM post-training, forked from and co-evolving with slime.
Perplexity open source garden for inference technology
An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Async RL)
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
NVIDIA GPU metrics exporter for Prometheus leveraging DCGM
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible.
slime is an LLM post-training framework for RL Scaling.
A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology
My learning notes for ML SYS.
Optimized primitives for collective multi-GPU communication
A PyTorch native platform for training generative AI models
What would you do with 1000 H100s...
Efficient Triton Kernels for LLM Training
Puzzles for learning Triton, play it with minimal environment configuration!
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
CUDA Python: Performance meets Productivity
FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batchsizes of 16-32 tokens.
FlashInfer: Kernel Library for LLM Serving
Text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Virtual whiteboard for sketching hand-drawn like diagrams
Tile primitives for speedy kernels
A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch