🌟
sparkle up the world
Starred repositories

A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.

Python 4,079 601 Updated Mar 13, 2026

Lightweight LLM inference engine inspired by nano-vllm, with radix-tree based prefix cache, tp & pp, cuda graph, openai api, async scheduling, and more.

Python 9 Updated Mar 29, 2026

GPU documentation for humans

Python 573 73 Updated Mar 24, 2026

Offline optimization of your disaggregated Dynamo graph

Python 278 107 Updated Apr 29, 2026

FlashInfer: Kernel Library for LLM Serving

Python 5,531 942 Updated Apr 28, 2026

Fast and memory-efficient exact attention

Python 23,587 2,665 Updated Apr 29, 2026

Nano vLLM

Python 13,178 2,014 Updated Apr 26, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 78,576 16,247 Updated Apr 29, 2026

SGLang is a high-performance serving framework for large language models and multimodal models.

Python 26,697 5,619 Updated Apr 29, 2026

Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.

Python 16,219 1,594 Updated Mar 4, 2026

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,985 287 Updated May 15, 2025

GPU programming related news and material links

2,115 126 Updated Mar 8, 2026

⭐ A simple, fast and powerful blog & document theme built by Astro

Astro 912 252 Updated Apr 21, 2026

Material for gpu-mode lectures

Jupyter Notebook 6,028 607 Updated Apr 22, 2026

collection of benchmarks to measure basic GPU capabilities

C++ 516 83 Updated Oct 24, 2025

Open-source Windows and Office activator featuring HWID, Ohook, TSforge, and Online KMS activation methods, along with advanced troubleshooting.

Batchfile 173,542 16,669 Updated Apr 17, 2026

Markdown can be used for posting Moments and Docs on my Astro-based site.

Astro 2 Updated Mar 29, 2026

A React.js component library for beautifully shaded canvas https://uvcanvas.com

TypeScript 1,365 52 Updated Mar 28, 2025

An Astro static blog template made for my (formerly) nonexistent cyber daughter

Astro 143 17 Updated Apr 25, 2026

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

Python 5,185 371 Updated Apr 20, 2026

A text marking & annotation engine for presenting source code on the web.

TypeScript 929 41 Updated Apr 21, 2026

Low-JavaScript embed components for Astro websites

Astro 389 53 Updated Apr 19, 2026

✨A static blog template built with Astro.

Astro 4,485 1,187 Updated Mar 10, 2026

StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation

Python 10,701 830 Updated Dec 4, 2024

A Chinese-language reinforcement learning tutorial (the "Mushroom Book" 🍄); read online at https://datawhalechina.github.io/easy-rl/

Jupyter Notebook 14,093 2,248 Updated Dec 30, 2025

My personal blog built with Astro, React, and Tailwind CSS.

TypeScript 74 12 Updated Mar 28, 2025

Awesome LLM compression research papers and tools.

1,826 126 Updated Feb 23, 2026

The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.

Python 110,580 12,900 Updated Apr 29, 2026

A curated list of awesome edge computing, including Frameworks, Simulators, Tools, etc.

501 84 Updated Apr 24, 2026

A markup-based typesetting system that is powerful and easy to learn.

Rust 53,169 1,554 Updated Apr 29, 2026