🎯
Focusing

Organizations

@PaddlePaddle


The agent that grows with you

Python 135,798 20,801 Updated May 6, 2026

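Claude Code leaked source code - a locally runnable version, with a new cross-platform desktop app that fills in Computer Use (includes an analysis of the core modules)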

TypeScript 9,773 7,614 Updated May 5, 2026

OpenClaw-RL: Train any agent simply by talking

Python 5,238 563 Updated Apr 30, 2026

The open source coding agent.

TypeScript 155,841 18,070 Updated May 6, 2026

A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.

Python 4,121 617 Updated Mar 13, 2026

SkyRL: A Modular Full-stack RL Library for LLMs

Python 1,809 315 Updated May 6, 2026

[ICLR 2026] Tree Search for LLM Agent Reinforcement Learning

Python 349 33 Updated Jan 26, 2026

Code for the paper "BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping" by Zhiheng Xi et al.

Python 92 6 Updated Jan 29, 2026

Tongyi Deep Research, the Leading Open-source Deep Research Agent

Python 18,802 1,447 Updated Feb 27, 2026

[ICLR 2026] Information Gain-based Policy Optimization: A Simple and Effective Approach for Multi-Turn Search Agents

Python 61 3 Updated Apr 23, 2026

[ICLR 2026] Agentic Reinforced Policy Optimization (ARPO)

Python 976 51 Updated Apr 13, 2026

[ICLR 2026] QeRL enables RL for 32B LLMs on a single H100 GPU.

Python 502 51 Updated Mar 30, 2026

Bridge Megatron-Core to Hugging Face/Reinforcement Learning

Python 211 71 Updated May 6, 2026

Self-evolving memory OS for LLM & AI Agents: ultra-persistent memory, hybrid-retrieval, and cross-task skill reuse, with 35.24% token savings

TypeScript 8,933 795 Updated May 6, 2026

slime is an LLM post-training framework for RL Scaling.

Python 5,575 774 Updated May 6, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 79,198 16,485 Updated May 6, 2026
Python 62 5 Updated Jul 21, 2025

Scalable toolkit for efficient model reinforcement

Python 1,613 368 Updated May 6, 2026

Agentic RL Training at Scale

Python 1,344 282 Updated May 6, 2026

Muon is an optimizer for hidden layers in neural networks

Python 2,549 118 Updated Jan 19, 2026

Unleashing the Power of Reinforcement Learning for Math and Code Reasoners

Python 745 44 Updated Jun 6, 2025

Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling

Python 479 26 Updated May 17, 2025

Democratizing Reinforcement Learning for LLMs

Python 5,477 551 Updated May 6, 2026

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 5,263 721 Updated May 6, 2026

FireFlyer Record file format, writer and reader for DL training samples.

Python 246 25 Updated Dec 1, 2022

The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible.

Python 5,136 491 Updated May 6, 2026

Official Repo for Open-Reasoner-Zero

Python 2,093 119 Updated Jun 2, 2025
Python 219 9 Updated Feb 20, 2025
Python 1,137 54 Updated Jan 10, 2026