Ecosystem & Stack: sglang

LMCache/LMCache — LLM Inference Optimization Engine (GPU, 8.0k stars)

LMCache is an LLM serving-engine extension that reduces Time-To-First-Token (TTFT) and boosts throughput, especially in long-context scenarios, by reusing KV caches across requests.
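The KV-cache-reuse idea can be illustrated with a toy sketch. This is plain Python, not LMCache's actual API: the "KV entries" are stand-in values, and the point is only that a request whose prompt shares a cached prefix skips prefill for that prefix and computes KV entries for the new suffix alone.

```python
# Conceptual sketch of prefix KV-cache reuse (NOT LMCache's API).
kv_store = {}  # maps a token-prefix tuple -> its precomputed "KV cache"

def prefill(tokens):
    # Stand-in for the expensive attention prefill over `tokens`;
    # a real engine would produce per-layer key/value tensors here.
    return [t * 2 for t in tokens]

def generate(tokens):
    # Find the longest cached prefix, then prefill only the remainder.
    for cut in range(len(tokens), 0, -1):
        prefix = tuple(tokens[:cut])
        if prefix in kv_store:
            kv = kv_store[prefix] + prefill(tokens[cut:])
            break
    else:
        kv = prefill(tokens)  # cold start: no shared prefix cached
    kv_store[tuple(tokens)] = kv
    return kv
```

A second request that extends an earlier prompt only pays prefill cost for its new tokens, which is the source of the TTFT reduction the description claims.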

ModelCloud/GPTQModel — LLM Optimization Toolkit (huggingface, 1.1k stars)

A toolkit for quantizing (compressing) Large Language Models with hardware acceleration on a range of GPUs and CPUs, with integrations for popular inference frameworks.
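In general terms, weight quantization maps float weights to low-bit integers plus a scale factor. The minimal symmetric int8 round-trip below is a generic sketch of that idea, not GPTQ's error-compensating algorithm (which quantizes column-by-column while correcting the remaining weights):

```python
def quantize_int8(weights):
    # Symmetric per-tensor quantization: scale so max |w| maps to 127.
    scale = max(abs(w) for w in weights) / 127.0 or 1e-8
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    # Recover approximate float weights from the int8 values.
    return [x * scale for x in q]
```

Each weight is then stored in 8 bits instead of 16 or 32, at the cost of a rounding error bounded by half the scale.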

alibaba/ROLL — Reinforcement Learning Library for LLMs (Ray, 3.1k stars)

An efficient, user-friendly library for scaling reinforcement learning with Large Language Models across large GPU clusters.


© 2026 OSS Alternative. hotgithub.com - All rights reserved.