LLM Serving Framework
12.3k stars · 2026-04-14
bentoml/OpenLLM
Self-host and serve any open-source LLM as an OpenAI-compatible API endpoint with ease.
Core Features
Run a wide range of open-source LLMs and custom models
Expose LLMs as OpenAI-compatible API endpoints
Built-in chat UI for interactive exploration
Simplified enterprise-grade cloud deployment with Docker and Kubernetes
State-of-the-art inference backends
Quick Start
pip install openllm
Detailed Introduction
OpenLLM is a framework that simplifies deploying and serving large language models. It lets developers self-host a wide range of open-source LLMs, including custom models, and expose them through OpenAI-compatible API endpoints. With a built-in chat UI, state-of-the-art inference backends, and streamlined cloud deployment via Docker, Kubernetes, and BentoCloud, OpenLLM offers a practical path to integrating LLMs into enterprise-grade applications.
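Because the served endpoint speaks the standard OpenAI chat-completions wire format, a client request can be sketched with nothing but the Python standard library. The endpoint URL, port, and model name below are illustrative placeholders, not values prescribed by OpenLLM:

```python
import json
import urllib.request

# Assumed local server address; substitute whatever host/port your
# OpenLLM instance actually listens on.
ENDPOINT = "http://localhost:3000/v1/chat/completions"

# Standard OpenAI chat-completions request body; "my-model" is a
# placeholder for the model id your server is running.
payload = {
    "model": "my-model",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize OpenLLM in one sentence."},
    ],
    "temperature": 0.7,
}

body = json.dumps(payload).encode("utf-8")

# Build (but do not send) the HTTP request; sending it requires a
# running server, e.g. `req_resp = urllib.request.urlopen(request)`.
request = urllib.request.Request(
    ENDPOINT,
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)
print(request.full_url)
```

Any OpenAI-compatible client library can be pointed at the same endpoint by overriding its base URL, which is what makes drop-in replacement of hosted APIs possible.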