Yining Li

Research Scientist, Shanghai AI Lab

I am a research scientist at Shanghai Artificial Intelligence Laboratory and an adjunct Ph.D. supervisor at Shanghai Jiao Tong University. I received my Ph.D. from the Multimedia Lab (MMLab) at The Chinese University of Hong Kong in 2019, advised by Prof. Chen Change Loy and Prof. Xiaoou Tang. Before that, I earned my B.S. degree from Tsinghua University in 2014. Prior to joining Shanghai AI Lab, I was a Senior Research Scientist at SenseTime from 2019 to 2021.

I work on large language models and agentic systems. I am a member of the Intern Large Models team, where we develop foundation models and agentic systems to accelerate scientific discovery. With a background in computer vision, I was a core member of OpenMMLab and led the development of MMPose.

Hiring: We have openings for research interns and full-time researchers in LLM training and infrastructure. I am also looking for Ph.D. students (in collaboration with SJTU). Feel free to email me if you are interested.


News

Apr 15, 2026 TREX technical report is available. We introduce TREX, a tree-search enhanced agentic system that automates the challenging process of LLM fine-tuning, along with a new benchmark, FT-Bench, comprising 10 fine-tuning tasks to evaluate automatic research systems. Check out the project page for more information.
Apr 07, 2026 RouteMoA is accepted to ACL 2026.
Mar 30, 2026 Kernel-Smith technical report is online.
Dec 12, 2025 MG-LLaVA is accepted to TCSVT.
Feb 26, 2025 Auto Cherry-Picker is accepted to CVPR 2025.
Feb 16, 2025 MIG is accepted to ACL 2025 Findings.
Jan 23, 2025 RMP-SAM is accepted to ICLR 2025 as an oral presentation.
Jan 15, 2025 InternLM3-8B-Instruct is released, supporting both a normal response mode for general-purpose use and a deep thinking mode for solving complicated reasoning tasks via long CoT.
Sep 26, 2024 5 papers accepted to NeurIPS 2024, 3 in the main track (MotionBooth, ADC, XComposer2-4KHD) and 2 in the Datasets and Benchmarks track (GTA, MMBench-Video).
Jul 11, 2024 We released RTMW, the newest addition to the RTMPose series, which specializes in predicting whole-body 2D and 3D keypoints simultaneously in real time.
Jul 01, 2024 Open-Vocabulary SAM is accepted to ECCV 2024.
May 26, 2024 InternLM2 technical report is online.
Feb 27, 2024 3 papers accepted to CVPR 2024: RTMO, OMG-Seg and ROVI.
Dec 08, 2023 We introduce AgentLego, a modular tool library to equip LLM agents with composable, multi-modal capabilities through standardized tool interfaces.

Selected Publications

  1. LLM/VLM
    GTA-2: Benchmarking General Tool Agents from Atomic Tool-Use to Open-Ended Workflows
    Jize Wang, Xuanxuan Liu, Yining Li, Songyang Zhang, Yijun Wang, and 5 more authors
    arXiv, 2026
  2. Agent
    RouteMoA: Dynamic Routing without Pre-Inference Boosts Efficient Mixture-of-Agents
    Jize Wang, Han Wu, Zhiyuan You, Yiming Song, Yijun Wang, and 7 more authors
    ACL, 2026
  3. Agent
    TREX: Automating LLM Fine-tuning via Agent-Driven Tree-based Exploration
    Zerun Ma, Guoqiang Wang, Xinchen Xie, Yicheng Chen, He Du, and 5 more authors
    arXiv, 2026
  4. LLM/VLM
    DataChef: Cooking Up Optimal Data Recipes for LLM Adaptation via Reinforcement Learning
    Yicheng Chen, Zerun Ma, Xinchen Xie, Yining Li, and Kai Chen
    arXiv, 2026
  5. LLM/VLM
    MIG: Automatic Data Selection for Instruction Tuning by Maximizing Information Gain in Semantic Space
    Yicheng Chen, Yining Li, Kai Hu, Zerun Ma, Haochen Ye, and 1 more author
    In Findings of ACL, 2025
  6. Vision & Multimodality
    Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language
    Yicheng Chen, Xiangtai Li, Yining Li, Yanhong Zeng, Jianzong Wu, and 2 more authors
    In CVPR, 2025
  7. Vision & Multimodality
    MotionBooth: Motion-Aware Customized Text-to-Video Generation
    Jianzong Wu, Xiangtai Li, Yanhong Zeng, Jiangning Zhang, Qianyu Zhou, and 3 more authors
    In NeurIPS Spotlight, 2024
  8. LLM/VLM
    InternLM2 Technical Report
    Zhaowei Cai, Ming Cao, Hao Chen, Kai Chen, Kaibo Chen, and 4 more authors
    arXiv, 2024
  9. LLM/VLM
    InternLM-XComposer2: Mastering Free-Form Text-Image Composition and Comprehension in Vision-Language Large Model
    Xiaoyi Dong, Pan Zhang, Yuhang Zang, Yan Cao, Boxiao Wang, and 4 more authors
    arXiv, 2024
  10. Agent
    GTA: A Benchmark for General Tool Agents
    Jize Wang, Zerun Ma, Yining Li, Songyang Zhang, Cailian Chen, and 2 more authors
    In NeurIPS Datasets and Benchmarks Track, 2024
  11. Vision & Multimodality
    RTMO: Towards High-Performance One-Stage Real-Time Multi-Person Pose Estimation
    Peng Lu, Tao Jiang, Yining Li, Xiangtai Li, Kai Chen, and 1 more author
    In CVPR, 2024