AI Text-to-Speech API Service
4.7k 2026-04-18

remsky/Kokoro-FastAPI

A Dockerized FastAPI wrapper providing a high-performance, multi-platform (CPU/GPU) and multi-language API for the Kokoro-82M text-to-speech model, compatible with OpenAI's speech endpoint.

Core Features

OpenAI-compatible Speech API endpoint
Multi-language support (English, Japanese, Chinese)
NVIDIA GPU acceleration and CPU inference (PyTorch/ONNX)
Phoneme-based audio and per-word timestamped caption generation
Voice mixing with weighted combinations

Quick Start

docker run -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-cpu:latest
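
Once the container is running, a request in the OpenAI speech format can be sent to it. The sketch below is a minimal illustration using only the Python standard library. The path `/v1/audio/speech` follows OpenAI's speech API. The model name `kokoro`, the voice id `af_bella`, and the helper names `build_speech_request` and `synthesize` are assumptions made for this example, so check the project's own documentation for the values it actually accepts.

```python
import json
import urllib.request

# Assumption: the container from Quick Start is listening on localhost:8880.
BASE_URL = "http://localhost:8880"


def build_speech_request(text, voice="af_bella", response_format="mp3",
                         base_url=BASE_URL):
    """Build (but do not send) a POST request in the OpenAI speech format."""
    payload = {
        "model": "kokoro",  # assumed model name, not confirmed by this page
        "input": text,
        "voice": voice,  # assumed voice id, not confirmed by this page
        "response_format": response_format,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, **kwargs):
    """Send the request and return the raw audio bytes from the response."""
    with urllib.request.urlopen(build_speech_request(text, **kwargs)) as resp:
        return resp.read()


# Example (needs the running container):
#   audio = synthesize("Hello from Kokoro")
#   with open("hello.mp3", "wb") as f:
#       f.write(audio)
```

Building the request separately from sending it makes the payload easy to inspect or log before any network call is made.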

Detailed Introduction

Kokoro-FastAPI wraps the Kokoro-82M text-to-speech model in a FastAPI service and ships it as a Docker image. It supports English, Japanese, and Chinese, and runs inference either on NVIDIA GPUs with PyTorch or on CPU with ONNX. Because it exposes an OpenAI-compatible speech endpoint, clients written against OpenAI's speech API should work with a self-hosted instance with little change. Beyond basic synthesis, it offers phoneme-based audio generation, per-word timestamped captions, and weighted voice mixing.
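
To make "weighted combinations" concrete, the sketch below turns a mapping of voice ids to weights into a single voice string and shows the relative share each voice would get. The `+` separator, the `(weight)` suffix, the voice ids, and the helper names `mix_voices` and `normalized_weights` are all assumptions for illustration. Whether the server parses this exact syntax, or normalizes weights this way, should be confirmed against the project's documentation.

```python
def normalized_weights(weights):
    """Scale positive weights so they sum to 1 (the relative share per voice)."""
    if not weights:
        raise ValueError("at least one voice is required")
    if any(w <= 0 for w in weights.values()):
        raise ValueError("weights must be positive")
    total = sum(weights.values())
    return {voice: w / total for voice, w in weights.items()}


def mix_voices(weights):
    """Join voice ids into one string such as 'af_bella(2)+af_sky(1)'.

    Assumed syntax: voices separated by '+', weight in parentheses.
    """
    normalized_weights(weights)  # reuse the validation; result is not needed
    return "+".join(f"{voice}({w:g})" for voice, w in weights.items())


# Hypothetical voice ids; a 2:1 blend of two voices.
blend = {"af_bella": 2, "af_sky": 1}
voice_string = mix_voices(blend)
shares = normalized_weights(blend)
```

Under these assumptions, the resulting string would be passed as the `voice` field of a speech request.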
