Tags: #speech-synthesis
Baiyuetribe/paper2gui
Packages advanced AI models as user-friendly desktop applications, making cutting-edge AI accessible to everyone without a manual installation step.
OpenBMB/VoxCPM
A tokenizer-free, multilingual Text-to-Speech system offering advanced voice design, controllable cloning, and high-quality audio output.
snakers4/silero-models
Silero Models offers a collection of pre-trained, end-to-end text-to-speech models designed for simplicity, speed, and natural-sounding speech generation.
fishaudio/Bert-VITS2
An open-source Text-to-Speech system built on the VITS2 backbone, enhanced with multilingual BERT for improved speech synthesis.
2noise/ChatTTS
A generative speech model optimized for natural and expressive daily dialogue, especially for LLM assistants.
fishaudio/fish-speech
A state-of-the-art open-source multilingual text-to-speech system offering natural, expressive, and emotionally rich voice generation.
index-tts/index-tts
IndexTTS2 is an industrial-level, zero-shot text-to-speech system offering precise duration control and disentangled emotional expression for highly natural and controllable speech synthesis.
CorentinJ/Real-Time-Voice-Cloning
Clones a voice from 5 seconds of audio to generate arbitrary speech in real time using a three-stage deep learning framework.
babysor/MockingBird
A powerful open-source toolkit for real-time voice cloning and arbitrary speech generation from text.
OpenMOSS/MOSS-TTS
An open-source AI model family for high-fidelity, expressive speech and sound generation across diverse real-world applications.
canopyai/Orpheus-TTS
A state-of-the-art open-source text-to-speech system leveraging LLMs to generate human-like, emotional, and low-latency speech with zero-shot voice cloning capabilities.
rhasspy/piper
A fast, local, neural text-to-speech system for efficient and private voice generation.
myshell-ai/MeloTTS
A high-quality, multilingual text-to-speech library supporting real-time CPU inference across various languages and accents.
netease-youdao/EmotiVoice
An open-source, multi-voice, prompt-controlled text-to-speech engine capable of generating speech with diverse emotions in English and Chinese.
yl4579/StyleTTS2
StyleTTS 2 is a text-to-speech model that achieves human-level speech synthesis by leveraging style diffusion and adversarial training with large speech language models.
TensorSpeech/TensorFlowTTS
TensorFlowTTS provides real-time, state-of-the-art speech synthesis architectures based on TensorFlow 2, supporting multiple languages and optimized for fast inference and deployment on various devices.
Plachtaa/VALL-E-X
An open-source implementation of Microsoft's VALL-E X, enabling zero-shot multilingual text-to-speech synthesis and voice cloning with emotion control.
jaywalnut310/vits
VITS is an end-to-end text-to-speech model that generates highly natural-sounding audio with diverse rhythms, outperforming traditional two-stage TTS systems.
mozilla/TTS
A deep learning library for advanced Text-to-Speech generation, offering high-quality speech synthesis with pretrained models and multi-language support.