gongel

🎯

Focusing

77 followers · 54 following

Beijing

Achievements

x2 x3

Lists (1)

🔮 Future ideas

Stars

NousResearch / hermes-agent

The agent that grows with you

Python 135,798 20,801 Updated May 6, 2026

NanmiCoder / cc-haha

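Claude Code leaked source code: a locally runnable version, with a new cross-platform desktop app that fills in Computer Use (includes an analysis of the core modules)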

TypeScript 9,773 7,614 Updated May 5, 2026

Gen-Verse / OpenClaw-RL

OpenClaw-RL: Train any agent simply by talking

Python 5,238 563 Updated Apr 30, 2026

anomalyco / opencode

The open source coding agent.

TypeScript 155,841 18,070 Updated May 6, 2026

sgl-project / mini-sglang

A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.

Python 4,121 617 Updated Mar 13, 2026

NovaSky-AI / SkyRL

SkyRL: A Modular Full-stack RL Library for LLMs

Python 1,809 315 Updated May 6, 2026

AMAP-ML / Tree-GRPO

[ICLR 2026] Tree Search for LLM Agent Reinforcement Learning

Python 349 33 Updated Jan 26, 2026

WooooDyy / BAPO

Codes for the paper "BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping" by Zhiheng Xi et al.

Python 92 6 Updated Jan 29, 2026

Alibaba-NLP / DeepResearch

Tongyi Deep Research, the Leading Open-source Deep Research Agent

Python 18,802 1,447 Updated Feb 27, 2026

GuoqingWang1 / IGPO

[ICLR 2026] Information Gain-based Policy Optimization: A Simple and Effective Approach for Multi-Turn Search Agents

Python 61 3 Updated Apr 23, 2026

RUC-NLPIR / ARPO

[ICLR 2026] Agentic Reinforced Policy Optimization (ARPO)

Python 976 51 Updated Apr 13, 2026

NVlabs / QeRL

[ICLR 2026] QeRL enables RL for 32B LLMs on a single H100 GPU.

Python 502 51 Updated Mar 30, 2026

ISEEKYAN / mbridge

Bridge Megatron-Core to Hugging Face/Reinforcement Learning

Python 211 71 Updated May 6, 2026

MemTensor / MemOS

Self-evolving memory OS for LLM & AI Agents: ultra-persistent memory, hybrid-retrieval, and cross-task skill reuse, with 35.24% token savings

TypeScript 8,933 795 Updated May 6, 2026

THUDM / slime

slime is an LLM post-training framework for RL Scaling.

Python 5,575 774 Updated May 6, 2026

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 79,198 16,485 Updated May 6, 2026

hkgc-1 / GHPO

Python 62 5 Updated Jul 21, 2025

NVIDIA-NeMo / RL

Scalable toolkit for efficient model reinforcement

Python 1,613 368 Updated May 6, 2026

PrimeIntellect-ai / prime-rl

Agentic RL Training at Scale

Python 1,344 282 Updated May 6, 2026

KellerJordan / Muon

Muon is an optimizer for hidden layers in neural networks

Python 2,549 118 Updated Jan 19, 2026

SkyworkAI / Skywork-OR1

Unleashing the Power of Reinforcement Learning for Math and Code Reasoners

Python 745 44 Updated Jun 6, 2025

QwenLM / ParScale

Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling

Python 479 26 Updated May 17, 2025

rllm-org / rllm

Democratizing Reinforcement Learning for LLMs

Python 5,477 551 Updated May 6, 2026

kvcache-ai / Mooncake

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 5,263 721 Updated May 6, 2026

HFAiLab / ffrecord

FireFlyer Record file format, writer and reader for DL training samples.

Python 246 25 Updated Dec 1, 2022

inclusionAI / AReaL

The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible.

Python 5,136 491 Updated May 6, 2026

Open-Reasoner-Zero / Open-Reasoner-Zero

Official Repo for Open-Reasoner-Zero

Python 2,093 119 Updated Jun 2, 2025

GAIR-NLP / LIMR

Python 219 9 Updated Feb 20, 2025

huggingface / Math-Verify

Python 1,137 54 Updated Jan 10, 2026

GuanghaoYe / Emergence-of-Thinking

Forked from OpenRLHF/OpenRLHF

Python 54 4 Updated Feb 11, 2025