Tags: #openai-api-compatible
LLM Serving Framework
Docker
12.3k
bentoml/OpenLLM
A framework for easily self-hosting and serving any open-source Large Language Model (LLM) as an OpenAI-compatible API endpoint in the cloud.
AI Inference Engine
WebGPU
17.8k
mlc-ai/web-llm
A high-performance, in-browser LLM inference engine with OpenAI API compatibility, leveraging WebGPU for local, private AI.
AI Inference Server
4.7k
Michael-A-Kuykendall/shimmy
Shimmy is a Python-free inference server written in Rust that provides a 100% OpenAI-compatible API for running Large Language Models (LLMs) locally, with zero dependencies.