Ecosystem & Stack: nvidia-gpus
AI/ML Serving Framework
NVIDIA GPUs
26.4k
sgl-project/sglang
SGLang is a high-performance serving framework for large language models and multimodal models, optimized for high inference throughput and low latency.
LLM Inference and Serving Engine
Python
78.1k
vllm-project/vllm
vLLM is a high-throughput and memory-efficient open-source library designed for fast and easy serving of large language models.
AI/Deep Learning Optimization Framework
NVIDIA GPUs
41.4k
hpcaitech/ColossalAI
An open-source framework designed to make large AI model training and inference cheaper, faster, and more accessible through advanced distributed computing and memory optimization techniques.
AI Speech Synthesis System
Python
16.3k
OpenBMB/VoxCPM
VoxCPM2 is a tokenizer-free, 2B-parameter Text-to-Speech system supporting 30 languages, creative voice design, and controllable voice cloning with 48kHz studio-quality audio output.