cubicc
  • ByteDance
  • Beijing, China

Letting AI Actively Manage Its Own Context

TypeScript 179 13 Updated Apr 17, 2026

A framework for efficient model inference with omni-modality models

Python 4,591 881 Updated May 6, 2026

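"A Guide to Using Open-Source Large Models": a tutorial tailored for Chinese beginners on quickly fine-tuning (full-parameter/LoRA) and deploying Chinese and international open-source large language models (LLMs) and multimodal large language models (MLLMs) in a Linux environment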

Jupyter Notebook 30,277 2,983 Updated Apr 24, 2026

A Datacenter Scale Distributed Inference Serving Framework

Rust 6,746 1,089 Updated May 6, 2026

Fast CUDA matrix multiplication from scratch

Cuda 1,171 179 Updated Sep 2, 2025

The "Small Vision-Language Model" (SVLM) is a compact multimodal model tailored for beginners or users with limited computational resources. Its main goal is to optimize the integration of visual a…

Python 13 Updated Sep 1, 2025

A course on LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.

Python 4,160 313 Updated Apr 24, 2026

A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.

Python 4,121 617 Updated Mar 13, 2026

Nano vLLM

Python 13,268 2,044 Updated Apr 26, 2026

Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.

Python 3,446 261 Updated Oct 18, 2024

A distributed approximate nearest neighbor (ANN) search library that provides high-quality vector index build, search, and distributed online serving toolkits for large-scale vector search sc…

C++ 4,990 617 Updated May 6, 2026

A low-latency, billion-scale, and updatable graph-based vector store on SSD.

Jupyter Notebook 116 40 Updated Apr 24, 2026
C++ 22 1 Updated Aug 30, 2025

vsag is a vector indexing library used for similarity search.

C++ 470 92 Updated May 6, 2026

Vector search engine inside Milvus, integrating FAISS, HNSW, DiskANN.

C++ 347 140 Updated May 6, 2026

Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of …

Go 16,140 1,271 Updated May 6, 2026

Exercises in C - used at University of Bristol in COMSM1201

TeX 62 34 Updated Dec 5, 2025

💻 A fully functional local AWS cloud stack. Develop and test your cloud & Serverless apps offline

Python 64,893 4,709 Updated Mar 23, 2026

moffee: Make Markdown Ready to Present

Python 1,336 63 Updated Aug 2, 2025

Questions to ask the interviewer at the end of a technical interview

18,450 1,380 Updated Mar 4, 2024

Function Composition for OpenFaaS

Go 261 38 Updated Mar 26, 2023

Train Ticket - A Benchmark Microservice System

Java 883 304 Updated Nov 21, 2025

FunctionBench: A Suite of Workloads for Serverless Cloud Function Service

Python 147 49 Updated Jun 17, 2024

Automatic resource configuration for serverless workflows.

Python 21 3 Updated Mar 24, 2024

A new engine for Durable Functions. https://microsoft.github.io/durabletask-netherite

C# 238 34 Updated May 6, 2026

Virtual Memory Abstraction for Serverless Architectures

C++ 49 15 Updated Mar 18, 2022

Thread pool implementation using C++11 threads

C++ 1,233 242 Updated Mar 1, 2024