Starred repositories
OpenClaw-RL: Train any agent simply by talking
The official repository of "A Comprehensive Survey on Reinforcement Learning-based Agentic Search: Foundations, Roles, Optimizations, Evaluations, and Applications".
Self-referential self-improving agents that can optimize for any computable task
Curated academic CV templates and guidelines for PhD students, researchers, and faculty job applicants.
AI agents that automatically run research on single-GPU nanochat training
LLM Chess - evaluating Large Language Models' reasoning and instruction-following abilities by simulating chess games
A collection of various LLM pruning implementations, training code for GPUs & TPUs, and evaluation scripts.
CATArena is an engineering-level tournament evaluation platform for Large Language Model (LLM)-driven code agents, based on an iterative competitive peer-learning framework.
AI-Trader: 100% Fully-Automated Agent-Native Trading
Synthetic data curation for post-training and structured data extraction
Benchmark LLM reasoning capability by solving chess puzzles.
Training VLM agents with multi-turn reinforcement learning
Harsh Jhamtani*, Varun Gangal*, Eduard Hovy, Graham Neubig, Taylor Berg-Kirkpatrick. Learning to Generate Move-by-Move Commentary for Chess Games from Large-Scale Social Forum Data. ACL 2018
Open source neural network chess engine with GPU acceleration and broad hardware support.
A Text-Based Environment for Interactive Debugging
This is the official GitHub repository for our survey paper "Beyond Single-Turn: A Survey on Multi-Turn Interactions with Large Language Models".
Fully open reproduction of DeepSeek-R1
[ICLR 2026] Learning to Reason without External Rewards
A library for generative social simulation
AI paper-trading project inspired by nof1 Alpha Arena, using ccxt for market quotes.
Procgen Benchmark: Procedurally-Generated Game-Like Gym-Environments
Defeating the Training-Inference Mismatch via FP16
Natural Language Reinforcement Learning
Post-training with Tinker
A library for mechanistic interpretability of GPT-style language models
MARSHAL: Incentivizing Multi-Agent Reasoning via Self-Play with Strategic LLMs
[ICLR 2026] Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play.