- Cambridge, UK
- UTC +01:00
- https://woonyee.org/
Starred repositories
Causal depthwise conv1d in CUDA, with a PyTorch interface
Summary of some awesome work for optimizing LLM inference
A powerful AI coding agent. Built for the terminal.
LEAKED SYSTEM PROMPTS FOR CHATGPT, GEMINI, GROK, CLAUDE, PERPLEXITY, CURSOR, DEVIN, REPLIT, AND MORE! - AI SYSTEMS TRANSPARENCY FOR ALL! 👐
🌱 a fast, batteries-included static-site generator that transforms Markdown content into fully functional websites
PyTorch code and models for VJEPA2 self-supervised learning from video.
Daily Arxiv Papers on LLM Systems
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
A free open source RAG based AI legal assistant.
ARIS ⚔️ (Auto-Research-In-Sleep) — Lightweight Markdown-only skills for autonomous ML research: cross-model review loops, idea discovery, and experiment automation. No framework, no lock-in — works…
LLMServingSim 2.0: A Unified Simulator for Heterogeneous and Disaggregated LLM Serving Infrastructure
A Simple and Universal Swarm Intelligence Engine, Predicting Anything.
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
List of Computer Science courses with video lectures.
Development repository for the Triton language and compiler
FlashMLA: Efficient Multi-head Latent Attention Kernels
SGLang is a high-performance serving framework for large language models and multimodal models.
Bayesian optimisation & Reinforcement Learning library developed by Huawei Noah's Ark Lab
Original source code for Modern C++ for Absolute Beginners 2nd ed.
Basic Development Environment - a set of foundational C++ libraries used at Bloomberg.
Personal Notes for Learning HPC & Parallel Computation [NO LONGER ADDING NEW CONTENT]