AI-powered Text-to-Speech Engine
8.5k 2026-04-18

netease-youdao/EmotiVoice

An open-source, multi-voice, prompt-controlled text-to-speech engine that generates speech with a range of emotions in English and Chinese.

Core Features

Multi-voice synthesis with over 2000 distinct voices.
Emotional speech generation (happy, excited, sad, angry).
Supports both English and Chinese languages.
Provides an easy-to-use web interface and scripting API.
Voice cloning from your own voice data.
OpenAI-compatible TTS API with speed tuning.

Quick Start

docker run -dp 127.0.0.1:8501:8501 -p 127.0.0.1:8000:8000 syq163/emoti-voice:latest
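
Once the container is running, the OpenAI-compatible TTS API can be exercised from a short script. The following is a minimal sketch, not the project's documented client. It assumes that port 8000 from the command above serves the API, that the route follows OpenAI's /v1/audio/speech convention, and that the server accepts a model name of "emoti-voice", a voice ID such as "8051", and an optional "prompt" field for the emotion. None of these are confirmed by the text above, so check them against the repository before relying on them.

```python
import json
import urllib.request

# Assumptions (not confirmed by this page): API on port 8000,
# OpenAI-style route, and the field names used in the payload below.
BASE_URL = "http://127.0.0.1:8000"
SPEECH_PATH = "/v1/audio/speech"


def build_speech_request(text, voice="8051", speed=1.0,
                         emotion_prompt=None, base_url=BASE_URL):
    """Build (but do not send) a POST request in the OpenAI TTS shape."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    payload = {
        "model": "emoti-voice",    # assumed model name
        "input": text,             # text to synthesize
        "voice": voice,            # assumed voice ID; "8051" is illustrative
        "speed": speed,            # speed tuning, 1.0 = normal
        "response_format": "mp3",
    }
    if emotion_prompt is not None:
        # Assumed non-OpenAI extension field for the emotion prompt.
        payload["prompt"] = emotion_prompt
    return urllib.request.Request(
        base_url + SPEECH_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, **kwargs):
    """Send the request and return the raw audio bytes."""
    req = build_speech_request(text, **kwargs)
    with urllib.request.urlopen(req, timeout=60) as resp:
        return resp.read()


if __name__ == "__main__":
    # Requires the container from the Quick Start command to be running.
    audio = synthesize("Hello from EmotiVoice.", speed=1.2,
                       emotion_prompt="happy")
    with open("hello.mp3", "wb") as f:
        f.write(audio)
```

Keeping request construction separate from sending makes it possible to inspect the payload without a running server. Run as a script, it writes hello.mp3 to the current directory.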

Detailed Introduction

EmotiVoice is an open-source text-to-speech engine developed by NetEase Youdao. It ships with more than 2000 voices and can synthesize speech with a specified emotion, selected through a prompt. It supports English and Chinese and exposes three interfaces: a web UI, a scripting API, and an OpenAI-compatible API. Together these make it usable for applications ranging from content creation to accessibility tools. Recent updates add voice cloning and a dedicated Mac app.
