LLM Caching Library

zilliztech/GPTCache

A semantic cache library for Large Language Models (LLMs) designed to significantly reduce API costs and accelerate response times.

Core Features

Semantic caching for LLM queries and responses.
Deep integration with popular LLM frameworks like LangChain and LlamaIndex.
Achieves up to 10x reduction in LLM API costs.
Provides up to 100x speed improvement for LLM interactions.
Offers a Docker image for language-agnostic deployment.

Quick Start

pip install gptcache

Detailed Introduction

GPTCache addresses two problems with Large Language Model (LLM) API calls: high cost and slow response times. It implements a semantic cache: instead of matching only exact repeats of a prompt, it embeds each query and serves a stored response when a new query is semantically similar to a previous one, avoiding a redundant API call. This cuts API spend and returns cached answers far faster than a round trip to the model, making LLM-powered applications cheaper and more responsive at scale. Integrations with frameworks such as LangChain and LlamaIndex make it straightforward to adopt in existing projects.
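The idea of a semantic cache can be sketched in a few lines: embed each query, and on lookup return the stored response whose query embedding is most similar, provided the similarity clears a threshold. The sketch below is illustrative only, not GPTCache's implementation; the bag-of-words embedding and the `SemanticCache` class are hypothetical stand-ins (GPTCache uses real embedding models and pluggable vector stores).

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words embedding. A real semantic cache would use a
    # sentence-embedding model (e.g. an ONNX model or an embeddings API).
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse bag-of-words vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    """Hypothetical minimal semantic cache: nearest-neighbor lookup
    over stored query embeddings, gated by a similarity threshold."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self.entries = []  # list of (query_embedding, cached_response)

    def put(self, query, response):
        self.entries.append((embed(query), response))

    def get(self, query):
        emb = embed(query)
        best_response, best_sim = None, 0.0
        for cached_emb, response in self.entries:
            sim = cosine(emb, cached_emb)
            if sim > best_sim:
                best_response, best_sim = response, sim
        # Only a sufficiently similar query counts as a cache hit.
        return best_response if best_sim >= self.threshold else None

cache = SemanticCache(threshold=0.8)
cache.put("what is the capital of france", "Paris")
# A slightly rephrased query still hits the cache...
print(cache.get("what is the capital of france?"))  # → Paris
# ...while an unrelated query misses and would go to the LLM.
print(cache.get("how do jet engines work"))  # → None
```

The threshold is the key tuning knob: too low and unrelated queries return stale answers; too high and paraphrases miss the cache, forfeiting the cost savings.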
