  • India


A curated list of papers, tools, and resources on Multi-Token Prediction (MTP) and related techniques in Large Language Models (LLMs), Speech-Language Models (SLMs), and more.

82 3 Updated Feb 7, 2026

The best ChatGPT that $100 can buy.

Python 33 Updated Apr 23, 2026

Train transformer language models with reinforcement learning.

Python 18,161 2,672 Updated Apr 24, 2026

A Quirky Assortment of CuTe Kernels

Python 946 119 Updated Apr 24, 2026

A curated list of awesome Go frameworks, libraries and software

Go 170,923 13,176 Updated Apr 23, 2026

A Curated List of Vision-Language-Action (VLA) and World Action Models (WAM) Research and Beyond

328 4 Updated Apr 23, 2026

Puffing up reinforcement learning

C 5,620 443 Updated Apr 23, 2026

SGLang is a high-performance serving framework for large language models and multimodal models.

Python 26,395 5,543 Updated Apr 25, 2026

FrontierSWE is an ultra-long-horizon coding agent benchmark that tests implementation, performance engineering, and ML research

C 84 3 Updated Apr 23, 2026
Python 1 Updated Apr 16, 2026

Reference code for the Meta-Harness paper.

Python 643 48 Updated Apr 16, 2026

🌱 A little course on Reinforcement Learning Environments for evaluating and training Language Models

Python 182 15 Updated Apr 23, 2026

250+ Fine-tuning & RL Notebooks for text, vision, audio, embedding, TTS models.

Jupyter Notebook 5,272 860 Updated Apr 24, 2026

Competitive GPU kernel optimization platform.

TypeScript 189 19 Updated Apr 24, 2026

🚀 Efficient implementations for emerging model architectures

Python 4,975 510 Updated Apr 22, 2026

Build your own high performance LLM inference engine in C++ and CUDA - a smaller version of vLLM

C++ 115 7 Updated Apr 14, 2026
C++ 3 Updated Jan 17, 2026

Flash Attention in ~100 lines of CUDA (forward pass only)

Cuda 1,124 111 Updated Dec 30, 2024

A curated list of academic papers and resources on Physical AI, focusing on Vision-Language-Action (VLA) models, world models, embodied AI, and robotic foundation models.

219 20 Updated Mar 30, 2026

A PyTorch implementation of the GPT-OSS-20B architecture. All components are coded from scratch: RoPE with YaRN, RMSNorm, SwiGLU with clamping and residual connection, Mixture-of-Experts (MoE), Sel…

Python 231 16 Updated Dec 2, 2025

Large Language Model (LLM) Systems Paper List

1,937 101 Updated Apr 17, 2026

📚 A curated list of awesome articles, videos, and other resources to learn and practice software architecture, patterns, and principles.

C# 10,980 954 Updated Feb 1, 2026

A curated list to learn about distributed systems

11,785 1,538 Updated Jan 10, 2025

rvLLM: High-performance LLM inference in Rust. Drop-in vLLM replacement.

Rust 685 63 Updated Apr 25, 2026

Solve puzzles. Learn CUDA.

Jupyter Notebook 12,065 931 Updated Sep 1, 2024

The Patterns of Scalable, Reliable, and Performant Large-Scale Systems

70,577 6,974 Updated Jan 4, 2026

A comprehensive list of papers for the definition of World Models and using World Models for General Video Generation, Embodied AI, and Autonomous Driving, including papers, codes, and related webs…

1,559 51 Updated Apr 24, 2026

GPU Engineering for AI Systems

HTML 299 35 Updated Apr 21, 2026

Curated collection of AI inference engineering resources — LLM serving, GPU kernels, quantization, distributed inference, and production deployment. Compiled from the AER Labs community.

101 9 Updated Feb 4, 2026