Ecosystem & Stack: sglang
LLM Inference Optimization Engine
vllm
8.1k
LMCache/LMCache
LMCache is an LLM serving engine extension that reduces Time-To-First-Token (TTFT) and increases throughput by reusing KV caches across storage tiers and serving instances.
LLM Optimization Toolkit
huggingface
1.1k
ModelCloud/GPTQModel
A toolkit for quantizing (compressing) Large Language Models (LLMs) with hardware acceleration across various GPUs and CPUs, integrating with popular inference frameworks.
Reinforcement Learning Library for LLMs
Ray
3.1k
alibaba/ROLL
An efficient, user-friendly library for scaling Reinforcement Learning with Large Language Models (LLMs).
AI-powered Text-to-Speech System
Docker
30.0k
fishaudio/fish-speech
An open-source multilingual text-to-speech system that generates natural, emotionally expressive speech.