OSS Alternative - Discover Top Open Source Alternatives to Popular Software

Michael-A-Kuykendall/shimmy

Shimmy is a Python-free Rust inference server that provides a 100% OpenAI-compatible API for running local Large Language Models (LLMs) with zero dependencies.

Core Features

OpenAI API compatibility for local LLMs

Single binary, Python-free Rust implementation

Supports GGUF and SafeTensors models

Automatic model discovery and port allocation

Advanced Mixture of Experts (MOE) support for large models

Quick Start

curl -L https://github.com/Michael-A-Kuykendall/shimmy/releases/latest/download/shimmy-linux-x86_64 -o shimmy && chmod +x shimmy && ./shimmy serve &

Detailed Introduction

Shimmy offers a lightweight, dependency-free solution for running Large Language Models locally, acting as a drop-in replacement for the OpenAI API. Built with Rust, it compiles into a single binary, eliminating Python dependencies and simplifying deployment. It automatically discovers GGUF and SafeTensors models, provides hot model swapping, and supports advanced features like MOE for efficient execution of large models on consumer hardware. This enables developers to integrate local LLMs into existing OpenAI-compatible tools and applications with minimal configuration, ensuring privacy and cost-effectiveness.