remsky/Kokoro-FastAPI

A Dockerized FastAPI wrapper for the Kokoro-82M text-to-speech model, offering multi-language support, CPU/GPU inference, and an OpenAI-compatible API.

Core Features

OpenAI-compatible Speech endpoint for easy integration.

Multi-language support including English, Japanese, and Chinese.

Optimized inference with NVIDIA GPU (PyTorch) or CPU (ONNX).

Phoneme-based audio generation and per-word timestamped captions.

Dockerized deployment with pre-built images for quick setup.

Quick Start

docker run -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-cpu:latest
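
Once the container is running, the OpenAI-compatible speech endpoint can be called from any HTTP client. The sketch below is a minimal illustration using only the Python standard library. It assumes the server follows OpenAI's `/v1/audio/speech` path and request body (`model`, `input`, `voice`, `response_format`); the model name `kokoro` and the voice name `af_bella` are placeholders not taken from this listing, so check the project's documentation for the values your server accepts.

```python
import json
import urllib.request

# Port 8880 comes from the docker run command above.
# Assumptions (not stated in this listing): the endpoint path mirrors
# OpenAI's /v1/audio/speech, and "kokoro" / "af_bella" are placeholder
# model and voice names.
BASE_URL = "http://localhost:8880"


def build_speech_request(text, voice="af_bella", model="kokoro",
                         response_format="mp3", base_url=BASE_URL):
    """Build (but do not send) a POST request in OpenAI's speech format."""
    payload = {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, out_path="speech.mp3", **kwargs):
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(text, **kwargs)
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path
```

With the container up, calling `synthesize("Hello from Kokoro")` should write the returned audio to `speech.mp3`, provided the assumed path and names match the running server. Building the request separately from sending it makes the payload easy to inspect before any network call is made.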

Detailed Introduction

Kokoro-FastAPI wraps the Kokoro-82M text-to-speech model in a Dockerized FastAPI application. It supports multiple languages, including English, Japanese, and Chinese, and runs inference on either an NVIDIA GPU or a CPU. Because the API is OpenAI-compatible, clients already written against OpenAI's speech endpoint can usually be pointed at a Kokoro-FastAPI server by changing only the base URL. Beyond basic synthesis, the project supports voice mixing, phoneme-based audio generation, and per-word timestamped captions.