Tags: #speech-synthesis
jamiepine/voicebox
An open-source, local-first AI voice studio offering voice cloning, speech generation, and dictation with complete privacy.
Baiyuetribe/paper2gui
Paper2GUI packages AI models from research papers as user-friendly, install-free desktop applications, making advanced AI accessible to everyone.
snakers4/silero-models
A collection of pre-trained, end-to-end text-to-speech models designed for simplicity, speed, and natural-sounding speech across multiple languages.
fishaudio/Bert-VITS2
An open-source text-to-speech model that combines the VITS2 backbone with multilingual BERT for high-quality, multi-language speech synthesis.
2noise/ChatTTS
A generative speech model optimized for natural, expressive dialogue in LLM assistants, featuring fine-grained prosodic control.
CorentinJ/Real-Time-Voice-Cloning
A deep learning framework for real-time voice cloning and text-to-speech synthesis from short audio samples.
babysor/MockingBird
A real-time voice cloning toolkit that allows users to replicate a voice in 5 seconds and generate arbitrary speech.
OpenMOSS/MOSS-TTS
An open-source AI model family for high-fidelity, expressive speech and sound generation across diverse real-world applications.
LokerL/tts-vue
A cross-platform desktop application for Microsoft Edge text-to-speech synthesis, built with Electron and Vue.
canopyai/Orpheus-TTS
Orpheus TTS is a state-of-the-art open-source text-to-speech system built on a Llama-3b backbone, aiming to generate human-sounding, emotionally rich speech with low latency.
rhasspy/piper
A fast, local neural text-to-speech system for generating high-quality speech offline.
yl4579/StyleTTS2
StyleTTS 2 is a text-to-speech model that targets human-level speech synthesis through style diffusion and adversarial training with large speech language models.
metavoiceio/metavoice-src
MetaVoice-1B is an open-source, 1.2B-parameter foundation model for highly expressive, human-like text-to-speech synthesis with advanced voice cloning capabilities.
TensorSpeech/TensorFlowTTS
TensorFlowTTS is a real-time, state-of-the-art speech synthesis library built on TensorFlow 2, supporting multiple languages and optimized for efficient deployment.
jaywalnut310/vits
VITS is an end-to-end text-to-speech model that generates natural-sounding audio with diverse rhythms; its authors report that it outperforms two-stage TTS systems.
mozilla/TTS
A deep learning library for advanced, high-quality, and efficient Text-to-Speech (TTS) synthesis, supporting multiple languages and models.