Tags: #model-serving

AI/ML Model Serving Framework
Python
8.6k

bentoml/BentoML

A Python library for building and deploying high-performance AI model inference APIs and multi-model serving systems.

Local LLM Runtime and Serving Platform
Docker
168.8k

ollama/ollama

Easily run, manage, and interact with open-source large language models locally on your machine.

AI Model Serving Platform
Python
9.2k

xorbitsai/inference

A unified, production-ready inference API for deploying and serving open-source language, speech, and multimodal AI models across various environments.

AI Development & Deployment Platform
pip
25.9k

modular/modular

A unified, open platform for accelerating AI model serving and scaling GenAI deployments with industry-leading performance across various hardware.

Developer Tool for AI Model Serving
Docker
2.7k

containers/ramalama

RamaLama simplifies the local serving and production inference of AI models from any source by leveraging familiar container patterns, eliminating complex host system configurations.

MLOps Platform
Python
6.6k

clearml/clearml

ClearML is an open-source MLOps/LLMOps solution that streamlines the entire AI workflow, from experiment management and data versioning to pipeline orchestration and model serving.

MLOps Platform
Kubernetes
4.7k

SeldonIO/seldon-core

An MLOps and LLMOps framework for deploying, managing, and scaling modular, data-centric AI applications and models on Kubernetes.

LLM Inference Server
macOS
9.9k

jundot/omlx

An optimized LLM inference server for Apple Silicon, featuring continuous batching, tiered KV caching, and macOS menu bar management for efficient local AI.

LLM Inference Server
Docker
3.8k

predibase/lorax

A multi-LoRA inference server designed to efficiently serve thousands of fine-tuned Large Language Models on a single GPU, drastically cutting serving costs while maintaining high throughput and low latency.

Multimodal AI Inference and Serving Framework
Python
4.4k

vllm-project/vllm-omni

vLLM-Omni is an efficient, flexible, and easy-to-use framework extending vLLM to serve omni-modality models (text, image, video, audio) with high throughput and an OpenAI-compatible API.

© 2026 OSS Alternative. hotgithub.com - All rights reserved.