Ecosystem & Stack: weka

LLM Inference Optimization Engine

8.1k

LMCache/LMCache

LMCache is an LLM serving engine extension that reduces Time-To-First-Token (TTFT) and increases throughput by reusing KV caches across storage tiers and serving instances.

llm kv-cache inference
