gongel
🎯 Focusing

Organizations

@PaddlePaddle


The agent that grows with you

Python 112,967 16,445 Updated Apr 23, 2026

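Claude Code leaked source code - a locally runnable version, with a new cross-platform desktop app that fills in Computer Use (includes an analysis of the core modules)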

TypeScript 8,277 7,133 Updated Apr 22, 2026

OpenClaw-RL: Train any agent simply by talking

Python 5,111 541 Updated Apr 21, 2026

The open source coding agent.

TypeScript 148,304 16,970 Updated Apr 23, 2026

A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.

Python 4,053 593 Updated Mar 13, 2026

SkyRL: A Modular Full-stack RL Library for LLMs

Python 1,782 307 Updated Apr 23, 2026

[ICLR 2026] Tree Search for LLM Agent Reinforcement Learning

Python 344 31 Updated Jan 26, 2026

Code for the paper "BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping" by Zhiheng Xi et al.

Python 92 6 Updated Jan 29, 2026

Tongyi Deep Research, the Leading Open-source Deep Research Agent

Python 18,731 1,441 Updated Feb 27, 2026

[ICLR 2026] Information Gain-based Policy Optimization: A Simple and Effective Approach for Multi-Turn Search Agents

Python 58 2 Updated Apr 23, 2026

[ICLR 2026] Agentic Reinforced Policy Optimization (ARPO)

Python 965 51 Updated Apr 13, 2026

[ICLR 2026] QeRL enables RL for 32B LLMs on a single H100 GPU.

Python 500 51 Updated Mar 30, 2026

Bridge Megatron-Core to Hugging Face/Reinforcement Learning

Python 209 70 Updated Apr 23, 2026

AI memory OS for LLM and Agent systems (moltbot, clawdbot, openclaw), enabling persistent Skill memory for cross-task skill reuse and evolution.

TypeScript 8,590 763 Updated Apr 23, 2026

slime is an LLM post-training framework for RL Scaling.

Python 5,446 739 Updated Apr 23, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 77,883 15,991 Updated Apr 23, 2026
Python 62 5 Updated Jul 21, 2025

Scalable toolkit for efficient model reinforcement

Python 1,563 349 Updated Apr 23, 2026

Agentic RL Training at Scale

Python 1,317 267 Updated Apr 23, 2026

Muon is an optimizer for hidden layers in neural networks

Python 2,500 115 Updated Jan 19, 2026

Unleashing the Power of Reinforcement Learning for Math and Code Reasoners

Python 744 44 Updated Jun 6, 2025

Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling

Python 479 25 Updated May 17, 2025

Democratizing Reinforcement Learning for LLMs

Python 5,447 546 Updated Apr 23, 2026

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 5,175 699 Updated Apr 23, 2026

FireFlyer Record file format, writer and reader for DL training samples.

Python 246 25 Updated Dec 1, 2022

The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible.

Python 5,092 482 Updated Apr 23, 2026

Official Repo for Open-Reasoner-Zero

Python 2,091 119 Updated Jun 2, 2025
Python 219 9 Updated Feb 20, 2025
Python 1,134 53 Updated Jan 10, 2026