Tags: #text-to-speech
OvidijusParsiunas/deep-chat
A highly customizable AI chatbot component designed for easy integration into any website or UI framework.
RunanywhereAI/runanywhere-sdks
A production-ready toolkit enabling developers to integrate private, offline, and fast on-device AI capabilities like LLMs, speech-to-text, and text-to-speech into their applications across various platforms.
jamiepine/voicebox
The open-source, local-first voice synthesis studio for voice cloning, speech generation, and audio effects.
FunAudioLLM/CosyVoice
CosyVoice is an advanced multi-lingual large language model-based text-to-speech system offering state-of-the-art voice generation, cloning, and full-stack deployment capabilities.
moonshine-ai/moonshine
An open-source, on-device AI toolkit for real-time, low-latency speech-to-text, intent recognition, and text-to-speech across multiple platforms.
DrewThomasson/ebook2audiobook
A powerful tool to convert e-books into audiobooks with advanced text-to-speech, voice cloning, and extensive language support.
snakers4/silero-models
Silero Models offers a collection of pre-trained, end-to-end text-to-speech models designed for simplicity, speed, and natural-sounding speech generation.
rsxdalv/TTS-WebUI
A single web interface integrating numerous state-of-the-art open-source models for text-to-speech, audio generation, and voice conversion.
rany2/edge-tts
A Python module and CLI tool to access Microsoft Edge's online text-to-speech service without an API key, Edge browser, or Windows.
index-tts/index-tts
IndexTTS2 is an industrial-level, zero-shot text-to-speech system offering precise duration control and disentangled emotional expression for highly natural and controllable speech synthesis.
CorentinJ/Real-Time-Voice-Cloning
Clone a voice in 5 seconds to generate arbitrary speech in real-time using a three-stage deep learning framework.
denizsafak/abogen
Generate high-quality audiobooks and voiceovers from various text formats with synchronized captions.
santinic/audiblez
Generate high-quality audiobooks in .m4b format from .epub e-books using advanced text-to-speech technology, with both command-line and graphical interfaces.
RVC-Boss/GPT-SoVITS
A powerful open-source web UI for few-shot voice conversion and text-to-speech, enabling high-quality voice cloning with minimal audio data.
WhisperSpeech/WhisperSpeech
An open-source, high-performance text-to-speech system built on Whisper, aiming to be a hackable and commercially safe alternative for speech generation.
Blaizzy/mlx-audio
A high-performance library built on Apple's MLX framework, offering efficient text-to-speech, speech-to-text, and speech-to-speech capabilities optimized for Apple Silicon.
OpenMOSS/MOSS-TTS
An open-source AI model family for high-fidelity, expressive speech and sound generation across diverse real-world applications.
edwko/OuteTTS
A versatile interface for OuteTTS models, providing flexible text-to-speech generation capabilities across various AI inference backends and hardware platforms.
yakGPT/yakGPT
A locally running, hands-free ChatGPT UI that adds speech-to-text input and text-to-speech output to chat conversations.
LokerL/tts-vue
A desktop application providing a user-friendly interface for Microsoft's speech synthesis capabilities.
canopyai/Orpheus-TTS
A state-of-the-art open-source text-to-speech system leveraging LLMs to generate human-like, emotional, and low-latency speech with zero-shot voice cloning capabilities.
jianchang512/ChatTTS-ui
Provides a local web interface and API for the ChatTTS model, enabling text-to-speech synthesis with support for mixed languages and numbers.
jianchang512/clone-voice
A user-friendly web-based tool for voice cloning, text-to-speech, and speech-to-speech conversion, leveraging the Coqui XTTS_v2 model with multi-language support.
jing332/tts-server-android
An advanced Android Text-to-Speech (TTS) application offering Microsoft TTS integration, custom HTTP requests, local engine support, and intelligent dialogue recognition.
rhasspy/piper
A fast, local, neural text-to-speech system for efficient and private voice generation.
myshell-ai/OpenVoice
An AI voice synthesis library offering instant, accurate, and flexible voice cloning with multi-lingual support.
MoonInTheRiver/DiffSinger
The official PyTorch implementation of DiffSinger, a singing voice synthesis (SVS) and text-to-speech (TTS) system that uses a shallow diffusion mechanism for high-quality audio generation.
myshell-ai/MeloTTS
A high-quality, multi-lingual text-to-speech library supporting real-time CPU inference across various languages and accents.
coqui-ai/TTS
A deep learning toolkit for Text-to-Speech, offering pretrained models, training tools, and dataset utilities.
netease-youdao/EmotiVoice
An open-source, multi-voice, prompt-controlled text-to-speech engine capable of generating speech with diverse emotions in English and Chinese.
yl4579/StyleTTS2
StyleTTS 2 is a text-to-speech model that achieves human-level speech synthesis by leveraging style diffusion and adversarial training with large speech language models.
TensorSpeech/TensorFlowTTS
TensorFlowTTS provides real-time, state-of-the-art speech synthesis architectures based on TensorFlow 2, supporting multiple languages and optimized for fast inference and deployment on various devices.
jaywalnut310/vits
VITS is an end-to-end text-to-speech model that generates highly natural-sounding audio with diverse rhythms, outperforming traditional two-stage TTS systems.
mozilla/TTS
A deep learning library for advanced Text-to-Speech generation, offering high-quality speech synthesis with pretrained models and multi-language support.