AI Voice Generation Platform
29.8k 2026-04-18
fishaudio/fish-speech
A state-of-the-art open-source multilingual text-to-speech system offering natural, expressive, and emotionally rich voice generation.
Core Features
State-of-the-art multilingual text-to-speech capabilities.
Fine-grained control over prosody and emotion using natural language tags.
Native multi-speaker and multi-turn conversation generation.
Trained on over 10 million hours of audio data across 80+ languages.
Detailed Introduction
Fish Speech is an open-source AI voice generation platform built around the S2 Pro model. It combines a Dual-Autoregressive architecture with reinforcement learning alignment to produce natural, realistic, and emotionally rich speech. The model was trained on over 10 million hours of audio data in 80+ languages. Natural language tags give sub-word-level control over prosody and emotion, and multi-speaker and multi-turn conversations are supported natively.
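To make the idea of tag-driven, multi-speaker input more concrete, here is a minimal sketch of how a tagged conversation script might be assembled before being handed to a synthesis model. It does not call Fish Speech. The `Turn` class, the `render_script` helper, the `<speaker:N>` prefix, and the square-bracket cue syntax are all hypothetical, chosen only for illustration; the project's documentation is the place to check the tag format the model actually accepts.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Turn:
    """One conversational turn (hypothetical structure, not a Fish Speech type)."""
    speaker: int                 # illustrative speaker index
    text: str                    # the words to be spoken
    tags: Tuple[str, ...] = ()   # free-form natural-language cues, e.g. "whispering"


def render_script(turns: List[Turn]) -> str:
    """Render turns as one line each, using an ASSUMED tag syntax.

    Format (illustrative only): <speaker:N> [cue][cue] text
    """
    lines = []
    for turn in turns:
        parts = [f"<speaker:{turn.speaker}>"]
        if turn.tags:
            # Concatenate cues in the order given, each in square brackets.
            parts.append("".join(f"[{tag}]" for tag in turn.tags))
        parts.append(turn.text)
        lines.append(" ".join(parts))
    return "\n".join(lines)


script = render_script([
    Turn(0, "Did you hear that?", ("whispering",)),
    Turn(1, "Relax, it's just the wind.", ("laughing", "amused")),
    Turn(0, "I hope you're right."),
])
print(script)
```

Keeping the cues as plain strings reflects the description above: control is expressed in natural language rather than through a fixed set of numeric parameters.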