Stars
A Chinese-language reinforcement learning tutorial (the "Mushroom Book" 🍄); read it online at https://datawhalechina.github.io/easy-rl/
CUDA Templates and Python DSLs for High-Performance Linear Algebra
Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs
The awesome collection of OpenClaw skills. 5,400+ skills filtered and categorized from the official OpenClaw Skills Registry. 🦞
AI-powered, vision-driven UI automation for every platform.
Midscene connector for PCs, supporting both local machines and remote PC servers, on Windows/Linux/macOS: a cross-platform desktop automation agent built on Midscene.
Paper list in the survey: A Survey on Vision-Language-Action Models: An Action Tokenization Perspective
[ACL 2026] Towards Efficient Large Language Model Serving: A Survey on System-Aware KV Cache Optimization
SpotServe: Serving Generative Large Language Models on Preemptible Instances
Persist and reuse KV Cache to speedup your LLM.
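For context, KV-cache reuse means computing the attention keys/values for a shared prompt prefix once, persisting them, and letting later requests that share the prefix skip that prefill work. A minimal sketch of the idea using the Hugging Face transformers API (illustrative only, not this project's interface):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

# 1) Prefill the shared prefix once and keep its KV cache around.
prefix = tok("You are a helpful assistant. ", return_tensors="pt")
with torch.no_grad():
    cache = model(**prefix, use_cache=True).past_key_values

# 2) A later request sharing the prefix reuses the cache instead of
#    recomputing attention over the prefix tokens.
suffix = tok("What is speculative decoding?", return_tensors="pt")
attn = torch.ones(
    1, prefix.input_ids.shape[1] + suffix.input_ids.shape[1], dtype=torch.long
)  # mask must cover cached prefix + new tokens
with torch.no_grad():
    out = model(input_ids=suffix.input_ids, attention_mask=attn,
                past_key_values=cache, use_cache=True)
```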
This project shares the technical principles behind large language models along with hands-on experience (LLM engineering and putting LLM applications into production).
TurboDiffusion: 100–200× Acceleration for Video Diffusion Models
Offline optimization of your disaggregated Dynamo graph
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
A domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
A Simplified PyTorch Implementation of Vision Transformer (ViT)
Train speculative decoding models effortlessly and port them smoothly to SGLang serving.
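For background, speculative decoding has a small draft model cheaply propose several tokens, which the large target model then verifies in a single forward pass. A minimal sketch of the greedy variant (production systems use an acceptance-rejection rule for sampling; `target` and `draft` are hypothetical callables returning per-position next-token logits):

```python
import torch

def speculative_step(target, draft, ids, k=4):
    """One greedy speculative-decoding step.

    `target(ids)` and `draft(ids)` return next-token logits for every
    position, shape [seq_len, vocab]. Accept the longest prefix of the
    draft's proposals that the target would also have chosen.
    """
    # Draft model proposes k tokens autoregressively (cheap).
    proposal = ids
    for _ in range(k):
        nxt = draft(proposal)[-1].argmax()
        proposal = torch.cat([proposal, nxt.view(1)])

    # Target scores all k proposals in ONE forward pass (expensive,
    # but amortized over up to k accepted tokens).
    logits = target(proposal)
    n = ids.shape[0]
    accepted = ids
    for i in range(k):
        want = logits[n - 1 + i].argmax()
        accepted = torch.cat([accepted, want.view(1)])  # draft token, or target's correction
        if want != proposal[n + i]:
            break  # first disagreement: discard the remaining draft tokens
    return accepted

# Toy usage: draft == target (a random scorer), so every proposal is accepted.
torch.manual_seed(0)
emb = torch.randn(100, 100)
score = lambda ids: emb[ids]          # [seq_len, vocab=100]
print(speculative_step(score, score, torch.tensor([1, 2, 3])))
```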
Open Model Engine (OME) — Kubernetes operator for LLM serving, GPU scheduling, and model lifecycle management. Works with SGLang, vLLM, TensorRT-LLM, and Triton
Puzzles for learning Triton
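For a flavor of what such puzzles build toward, the canonical first Triton kernel is a masked, blocked vector add: each program instance owns one block of elements (this mirrors the official Triton tutorial and needs a CUDA GPU):

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)                      # which block this instance owns
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements                      # guard the ragged last block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

x = torch.randn(4096, device="cuda")
y = torch.randn(4096, device="cuda")
out = torch.empty_like(x)
grid = (triton.cdiv(x.numel(), 1024),)               # one program per 1024-element block
add_kernel[grid](x, y, out, x.numel(), BLOCK_SIZE=1024)
assert torch.allclose(out, x + y)
```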
A workload for deploying LLM inference services on Kubernetes
Checkpoint-engine is a simple middleware to update model weights in LLM inference engines
Since the emergence of ChatGPT in 2022, accelerating large language models has become increasingly important. Here is a list of papers on accelerating LLMs, currently focused mainly on inference.
Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction | A tiny BERT model can tell you the verbosity of an LLM (with low latency overhead!)
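The idea in that last entry: a cheap proxy model predicts each request's output length before dispatch, so the scheduler can serve shortest-predicted-job-first and cut mean queueing delay. A minimal sketch with a hypothetical `predict_length` regressor standing in for the tiny BERT model:

```python
import heapq

def schedule_sjf(requests, predict_length):
    """Shortest-predicted-job-first dispatch order.

    `predict_length(prompt)` is a stand-in for the proxy regressor: it
    returns an estimated number of output tokens. Serving the
    predicted-shortest requests first reduces mean waiting time.
    """
    heap = [(predict_length(p), i, p) for i, p in enumerate(requests)]
    heapq.heapify(heap)
    while heap:
        est, _, prompt = heapq.heappop(heap)
        yield prompt, est  # dispatch to the LLM serving engine here

# Toy proxy: pretend longer prompts yield proportionally longer answers.
order = schedule_sjf(["hi", "explain KV caching in detail", "ok?"],
                     predict_length=lambda p: len(p.split()) * 8)
for prompt, est in order:
    print(est, prompt)
```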