Ecosystem & Stack: sglang

LMCache/LMCache — LLM Inference Optimization Engine (GPU, 8.0k stars)

LMCache is an LLM serving-engine extension that reduces Time-To-First-Token (TTFT) and boosts throughput, especially in long-context scenarios, by reusing KV caches across requests.
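The KV-cache-reuse idea can be illustrated with a toy sketch. This is plain Python, not LMCache's actual API: the "KV entries" are stand-in values, and the point is only that a request whose prompt shares a cached prefix skips prefill for that prefix and computes KV entries for the new suffix alone.

```python
# Conceptual sketch of prefix KV-cache reuse (NOT LMCache's API).
kv_store = {}  # maps a token-prefix tuple -> its precomputed "KV cache"

def prefill(tokens):
    # Stand-in for the expensive attention prefill over `tokens`;
    # a real engine would produce per-layer key/value tensors here.
    return [t * 2 for t in tokens]

def generate(tokens):
    # Find the longest cached prefix, then prefill only the remainder.
    for cut in range(len(tokens), 0, -1):
        prefix = tuple(tokens[:cut])
        if prefix in kv_store:
            kv = kv_store[prefix] + prefill(tokens[cut:])
            break
    else:
        kv = prefill(tokens)  # cold start: no shared prefix cached
    kv_store[tuple(tokens)] = kv
    return kv
```

A second request that extends an earlier prompt only pays prefill cost for its new tokens, which is the source of the TTFT reduction the description claims.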

ModelCloud/GPTQModel — LLM Optimization Toolkit (huggingface, 1.1k stars)

A toolkit for quantizing (compressing) Large Language Models with hardware acceleration on a range of GPUs and CPUs, with integrations for popular inference frameworks.
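In general terms, weight quantization maps float weights to low-bit integers plus a scale factor. The minimal symmetric int8 round-trip below is a generic sketch of that idea, not GPTQ's error-compensating algorithm (which quantizes column-by-column while correcting the remaining weights):

```python
def quantize_int8(weights):
    # Symmetric per-tensor quantization: scale so max |w| maps to 127.
    scale = max(abs(w) for w in weights) / 127.0 or 1e-8
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    # Recover approximate float weights from the int8 values.
    return [x * scale for x in q]
```

Each weight is then stored in 8 bits instead of 16 or 32, at the cost of a rounding error bounded by half the scale.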

alibaba/ROLL — Reinforcement Learning Library for LLMs (Ray, 3.1k stars)

An efficient, user-friendly library for scaling reinforcement learning with Large Language Models across large GPU clusters.


© 2026 OSS Alternative. hotgithub.com - All rights reserved.