Ecosystem & Stack: nvidia-gpus

AI/ML Serving Framework

26.4k

sgl-project/sglang

SGLang is a high-performance serving framework for large language models and multimodal models, optimized for high inference throughput and low latency.

llm-serving high-performance multimodal-ai

LLM Inference and Serving Engine

Python

78.1k

vllm-project/vllm

vLLM is a high-throughput and memory-efficient open-source library designed for fast and easy serving of large language models.

llm inference serving

AI/Deep Learning Optimization Framework

NVIDIA GPUs

41.4k

hpcaitech/ColossalAI

An open-source framework designed to make large AI model training and inference cheaper, faster, and more accessible through advanced distributed computing and memory optimization techniques.

deep-learning distributed-training llm-optimization

Replaces:

OpenRouter

AI Speech Synthesis System

Python

16.3k

OpenBMB/VoxCPM

VoxCPM2 is a tokenizer-free, 2B-parameter Text-to-Speech system supporting 30 languages, creative voice design, and controllable voice cloning with 48kHz studio-quality audio output.

tokenizer-free-tts multilingual-speech voice-cloning

Replaces:

Commercial Text-to-Speech Services, Voice Cloning Platforms
