Deep Learning Library
45.1k 2026-04-18
coqui-ai/TTS
A deep learning toolkit for Text-to-Speech, offering pretrained models, training tools, and dataset utilities.
Core Features
Pretrained models covering more than 1100 languages.
Comprehensive toolkit for training and fine-tuning custom TTS models.
Integration with state-of-the-art TTS models like XTTSv2, Bark, and Tortoise.
Utilities for dataset analysis and curation.
Streaming speech generation with under 200ms latency.
Detailed Introduction
🐸TTS is a deep learning library for Text-to-Speech generation, built for both research and production use. It ships pretrained models covering more than 1100 languages and provides the tools to train new models or fine-tune existing ones. It also includes utilities for dataset analysis and curation, and integrates models such as ⓍTTSv2, Bark, and Tortoise, with support for voice cloning and low-latency streaming (<200ms). Developers and researchers can use it as a general-purpose speech synthesis toolkit.
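As a rough illustration of the pretrained-model and voice-cloning features described above, the following is a minimal sketch and not an official recipe. It assumes the package is installed (for example with `pip install TTS`) and that the `TTS.api.TTS` class, the `tts_to_file` method, and the XTTSv2 model name behave as shown in the project's README. The helper name `clone_voice`, its defaults, and the file paths are hypothetical and exist only for this example.

```python
from pathlib import Path

# Model identifier as listed in the project's README (assumed unchanged).
XTTS_MODEL = "tts_models/multilingual/multi-dataset/xtts_v2"


def clone_voice(text, speaker_wav, out_path="output.wav", language="en", device="cpu"):
    """Synthesize `text` in the voice of `speaker_wav` and write a WAV file.

    Hypothetical helper for illustration; only the calls on the `TTS`
    object come from the library itself.
    """
    # Imported lazily so this module can be loaded without the library installed.
    from TTS.api import TTS

    # Downloads the pretrained model on first use, then moves it to the device.
    tts = TTS(XTTS_MODEL).to(device)

    # A short reference clip of the target speaker drives the voice cloning.
    tts.tts_to_file(
        text=text,
        speaker_wav=speaker_wav,
        language=language,
        file_path=out_path,
    )
    return Path(out_path)
```

A call might look like `clone_voice("Hello world.", "my_voice_sample.wav")`, where `my_voice_sample.wav` is a placeholder for a short recording of the target speaker. Passing `device="cuda"` would be the natural choice on a machine with a supported GPU.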