waqasm86@gmail.com
- Lahore, Pakistan
- https://llamatelemetry.github.io/
- in/mohammad-waqas-3a1384270
- @waqasm86
Pinned
- llamatelemetry/llamatelemetry (Public)
  A CUDA-dedicated LLM inference and observability tool for local LLM models in GGUF format, built on llama.cpp.
  Jupyter Notebook
- llcuda/llcuda (Public)
  CUDA 12-first inference backend for Unsloth on Kaggle, optimized for small GGUF models (1B-5B) on dual Tesla T4 GPUs (15 GB each, SM 7.5).
- cuda-nvidia-systems-engg (Public)
  Production-grade C++20/CUDA distributed LLM inference system with TCP networking, MPI scheduling, and content-addressed storage. Features comprehensive benchmarking (p50/p95/p99 latencies), epoll a…
  C++
- llm-observability-stack (Public)
  An opinionated umbrella Helm chart for a local single-node k3s + NVIDIA GPU + Ollama + Open WebUI + LangChain/LangSmith setup.
  Jupyter Notebook