Tags: #inference - OSS Alternative - Discover Top Open Source Alternatives to Popular Software

Local AI Development Platform
Python
62.9k

unslothai/unsloth

Unsloth Studio is a web UI that enables efficient local training and inference of open-source large language models and other AI models with significant VRAM and speed optimizations.

LLM Inference and Serving Engine
Python
78.1k

vllm-project/vllm

vLLM is a high-throughput and memory-efficient open-source library designed for fast and easy serving of large language models.

AI Model Inference Serving Platform
Python
9.3k

xorbitsai/inference

A unified, production-ready inference API for deploying and serving open-source language, speech, and multimodal AI models on various infrastructures.

AI/ML Developer Resource
Python
18.3k

meta-llama/llama-cookbook

An official guide and collection of recipes for building applications with the Llama model family, covering inference, fine-tuning, and RAG.

AI Model Serving Tool
Podman
2.8k

containers/ramalama

RamaLama is an open-source developer tool that simplifies the local serving and production inference of AI models by leveraging familiar container technology.

AI/ML Deployment Toolkit
Python
3.7k

PaddlePaddle/FastDeploy

A high-performance inference and deployment toolkit for Large Language Models (LLMs) and Vision-Language Models (VLMs) based on PaddlePaddle.

LLM Inference Optimization Engine
vLLM
8.1k

LMCache/LMCache

LMCache is an extension for LLM serving engines that reduces time to first token (TTFT) and raises throughput by reusing KV caches across storage tiers and serving instances.

Technical Guide & Knowledge Base
cloud computing
17.8k

stas00/ml-engineering

An open collection of methodologies, tools, and step-by-step instructions for successful training, fine-tuning, and inference of large language and multi-modal models.

Curated Resource List
3.8k

mnfst/awesome-free-llm-apis

A comprehensive list of Large Language Model (LLM) APIs offering permanent free tiers for text inference, including provider and third-party inference services.

AI Inference Engine
WebGPU
17.8k

mlc-ai/web-llm

A high-performance, in-browser LLM inference engine with OpenAI API compatibility, leveraging WebGPU for local, private AI.

Curated Resource List
19.6k

cheahjs/free-llm-api-resources

A comprehensive list of free and trial-based LLM inference resources accessible via API.

Serverless AI Runtime
Python
1.6k

beam-cloud/beta9

An ultrafast, open-source Pythonic runtime for deploying and scaling serverless GPU inference, sandboxes, and background jobs with zero infrastructure overhead.

AI/ML Inference Serving Framework
Hugging Face
4.6k

vllm-project/vllm-omni

A framework for efficient, fast, and cheap serving of omni-modality (text, image, video, audio) AI models.

AI/ML Inference Library
C++
2.1k

vitoplantamura/OnnxStream

A lightweight C++ inference library designed to run large ONNX-based AI models like Stable Diffusion XL and Mistral 7B on resource-constrained devices with minimal memory footprint.

© 2026 OSS Alternative. hotgithub.com - All rights reserved.