gongel

🎯

Focusing

77 followers · 54 following

Beijing

Achievements

x2 x3

Organizations

Lists (1)

🔮 Future ideas

Stars

NousResearch / hermes-agent

The agent that grows with you

Python 112,967 16,445 Updated Apr 23, 2026

NanmiCoder / cc-haha

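Claude Code leaked source code: a locally runnable version, with a new cross-platform desktop app that fills in Computer Use (includes an analysis of the core modules)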

TypeScript 8,277 7,133 Updated Apr 22, 2026

Gen-Verse / OpenClaw-RL

OpenClaw-RL: Train any agent simply by talking

Python 5,111 541 Updated Apr 21, 2026

anomalyco / opencode

The open source coding agent.

TypeScript 148,304 16,970 Updated Apr 23, 2026

sgl-project / mini-sglang

A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.

Python 4,053 593 Updated Mar 13, 2026

NovaSky-AI / SkyRL

SkyRL: A Modular Full-stack RL Library for LLMs

Python 1,782 307 Updated Apr 23, 2026

AMAP-ML / Tree-GRPO

[ICLR 2026] Tree Search for LLM Agent Reinforcement Learning

Python 344 31 Updated Jan 26, 2026

WooooDyy / BAPO

Code for the paper "BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping" by Zhiheng Xi et al.

Python 92 6 Updated Jan 29, 2026

Alibaba-NLP / DeepResearch

Tongyi Deep Research, the Leading Open-source Deep Research Agent

Python 18,731 1,441 Updated Feb 27, 2026

GuoqingWang1 / IGPO

[ICLR 2026] Information Gain-based Policy Optimization: A Simple and Effective Approach for Multi-Turn Search Agents

Python 58 2 Updated Apr 23, 2026

RUC-NLPIR / ARPO

[ICLR 2026] Agentic Reinforced Policy Optimization (ARPO)

Python 965 51 Updated Apr 13, 2026

NVlabs / QeRL

[ICLR 2026] QeRL enables RL for 32B LLMs on a single H100 GPU.

Python 500 51 Updated Mar 30, 2026

ISEEKYAN / mbridge

Bridge Megatron-Core to Hugging Face/Reinforcement Learning

Python 209 70 Updated Apr 23, 2026

MemTensor / MemOS

AI memory OS for LLM and Agent systems (moltbot, clawdbot, openclaw), enabling persistent Skill memory for cross-task skill reuse and evolution.

TypeScript 8,590 763 Updated Apr 23, 2026

THUDM / slime

slime is an LLM post-training framework for RL Scaling.

Python 5,446 739 Updated Apr 23, 2026

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 77,883 15,991 Updated Apr 23, 2026

hkgc-1 / GHPO

Python 62 5 Updated Jul 21, 2025

NVIDIA-NeMo / RL

Scalable toolkit for efficient model reinforcement

Python 1,563 349 Updated Apr 23, 2026

PrimeIntellect-ai / prime-rl

Agentic RL Training at Scale

Python 1,317 267 Updated Apr 23, 2026

KellerJordan / Muon

Muon is an optimizer for hidden layers in neural networks

Python 2,500 115 Updated Jan 19, 2026

SkyworkAI / Skywork-OR1

Unleashing the Power of Reinforcement Learning for Math and Code Reasoners

Python 744 44 Updated Jun 6, 2025

QwenLM / ParScale

Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling

Python 479 25 Updated May 17, 2025

rllm-org / rllm

Democratizing Reinforcement Learning for LLMs

Python 5,447 546 Updated Apr 23, 2026

kvcache-ai / Mooncake

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 5,175 699 Updated Apr 23, 2026

HFAiLab / ffrecord

FireFlyer Record file format, writer and reader for DL training samples.

Python 246 25 Updated Dec 1, 2022

inclusionAI / AReaL

The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible.

Python 5,092 482 Updated Apr 23, 2026

Open-Reasoner-Zero / Open-Reasoner-Zero

Official Repo for Open-Reasoner-Zero

Python 2,091 119 Updated Jun 2, 2025

GAIR-NLP / LIMR

Python 219 9 Updated Feb 20, 2025

huggingface / Math-Verify

Python 1,134 53 Updated Jan 10, 2026

GuanghaoYe / Emergence-of-Thinking

Forked from OpenRLHF/OpenRLHF

Python 53 4 Updated Feb 11, 2025