Stars
OfficeCLI is the first and best Office suite purpose-built for AI agents to read, edit, and automate Word, Excel, and PowerPoint files. Free, open-source, single binary, no Office installation requ…
AI agents running research on single-GPU nanochat training automatically
The first industrial-grade, end-to-end AI film and video production platform. Industry-first professional AI Agent platform for controllable film & video production. From shorts to live-action with Hollywood-standard workflows.
Demo-ICL: In-Context Learning for Procedural Video Knowledge Acquisition
[CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥 🔥 🔥
[ICRA 2026] VITRA: Scalable Vision-Language-Action Model Pretraining for Robotic Manipulation with Real-Life Human Activity Videos
A general physics-based retargeting framework.
Open source tactile glove for robotics research
An open source quadruped robot pet framework for developing Boston Dynamics-style four-legged robots that are perfect for STEM, coding & robotics education, IoT robotics applications, AI-enhanced r…
The repository provides code for running inference with the SAM 3D Body Model (3DB), links for downloading the trained model checkpoints and datasets, and example notebooks that show how to use the…
[ICLR'26] IGGT: Instance-Grounded Geometry Transformer for Semantic 3D Reconstruction
PhysX-Anything: Simulation-Ready Physical 3D Assets from Single Image (CVPR 2026)
cuVSLAM: CUDA-Accelerated Visual Odometry and Mapping
Fisheye-Calib-Adapter: An Easy Tool for Fisheye Camera Model Conversion
TiDB is built for agentic workloads that grow unpredictably, with ACID guarantees and native support for transactions, analytics, and vector search. No data silos. No noisy neighbors. No infrastruc…
[ICLR 2026] Trace Anything: Representing Any Video in 4D via Trajectory Fields
NEO Series: Native Vision-Language Models from First Principles
A curated list of awesome papers for reconstructing 4D spatial intelligence from video. (arXiv 2507.21045)
[ICRA 2026] GMR: General Motion Retargeting. Retarget human motions into diverse humanoid robots in real time on CPU. Retargeter for TWIST.
[SIGGRAPH Asia 2025 - TOG] Official implementation of MILo: Mesh-In-the-Loop Gaussian Splatting for Detailed and Efficient Surface Reconstruction
Dynamic 3D Foundation Model using Causal Transformer. [ICLR 2026]
Official Implementation of "Trans-Adapter: A Plug-and-Play Framework for Transparent Image Inpainting"