Tags: #text-to-speech
RunanywhereAI/runanywhere-sdks
A production-ready toolkit enabling developers to integrate private, offline, and fast on-device AI capabilities like LLMs, speech-to-text, and text-to-speech into their applications across various platforms.
FunAudioLLM/CosyVoice
CosyVoice is a multilingual text-to-speech system based on a large language model, offering state-of-the-art voice generation, voice cloning, and full-stack deployment capabilities.
moonshine-ai/moonshine
An open-source, on-device AI toolkit for real-time, low-latency speech-to-text, intent recognition, and text-to-speech across multiple platforms.
DrewThomasson/ebook2audiobook
Generate audiobooks from e-books with advanced text-to-speech, voice cloning, and extensive language support.
snakers4/silero-models
A collection of pre-trained, end-to-end text-to-speech models designed for simplicity, speed, and natural-sounding speech across multiple languages.
PaddlePaddle/PaddleSpeech
An easy-to-use open-source toolkit built on PaddlePaddle, offering state-of-the-art models for speech and audio tasks such as ASR, TTS, speech translation, and speaker verification.
rsxdalv/TTS-WebUI
A unified Gradio and React web interface integrating a large collection of open-source text-to-speech, audio generation, and voice conversion AI models.
rany2/edge-tts
Access Microsoft Edge's online text-to-speech service from Python without needing Edge, Windows, or an API key.
index-tts/index-tts
An industrial-level, zero-shot text-to-speech system offering precise duration control and disentangled emotional expression for highly natural and controllable speech synthesis.
CorentinJ/Real-Time-Voice-Cloning
A deep learning framework for real-time voice cloning and text-to-speech synthesis from short audio samples.
denizsafak/abogen
Generate high-quality audiobooks and voiceovers from various text formats with synchronized captions.
babysor/MockingBird
A real-time voice cloning toolkit that allows users to replicate a voice in 5 seconds and generate arbitrary speech.
santinic/audiblez
A Python-based tool to convert e-books (EPUB) into high-quality M4B audiobooks using advanced text-to-speech models.
RVC-Boss/GPT-SoVITS
A powerful web-based tool for few-shot voice cloning and text-to-speech, enabling high-quality voice generation from minimal audio data.
remsky/Kokoro-FastAPI
A Dockerized FastAPI wrapper for the Kokoro-82M text-to-speech model, offering multi-language support, CPU/GPU inference, and an OpenAI-compatible API.
WhisperSpeech/WhisperSpeech
An open-source, high-performance text-to-speech (TTS) system built by inverting OpenAI's Whisper, aiming to be the Stable Diffusion of speech.
OpenMOSS/MOSS-TTS
An open-source AI model family for high-fidelity, expressive speech and sound generation across diverse real-world applications.
edwko/OuteTTS
A versatile interface for OuteTTS models, providing flexible text-to-speech generation capabilities across various AI inference backends and hardware platforms.
yakGPT/yakGPT
A locally running, hands-free ChatGPT UI that enhances text generation and chat engagement with speech-to-text and text-to-speech capabilities.
canopyai/Orpheus-TTS
Orpheus TTS is a state-of-the-art open-source text-to-speech system built on a Llama-3b backbone, aiming to generate human-sounding, emotionally rich speech with low latency.
abus-aikorea/voice-pro
An AI-powered web application for comprehensive multimedia content creation, offering speech recognition, voice cloning, text-to-speech, and multilingual translation.
jianchang512/ChatTTS-ui
Provides a local web interface and API for ChatTTS to synthesize speech from text, including text that mixes Chinese, English, and numerals.
jianchang512/clone-voice
A user-friendly, open-source tool that clones any human voice to generate speech from text or convert existing audio, featuring a web interface and multi-language support.
jing332/tts-server-android
An advanced Android TTS application offering Microsoft TTS integration, custom HTTP requests, local engine support, dialogue recognition, and robust features like auto-retry and text replacement.
rhasspy/piper
A fast, local neural text-to-speech system for generating high-quality speech offline.
myshell-ai/OpenVoice
An open-source AI model for instant, accurate, and flexible voice cloning, supporting cross-lingual synthesis and granular style control.
MoonInTheRiver/DiffSinger
The official PyTorch implementation of DiffSinger, a singing voice synthesis (SVS) and text-to-speech (TTS) system that uses a shallow diffusion mechanism for high-quality audio generation.
myshell-ai/MeloTTS
A high-quality, multilingual text-to-speech library supporting various languages and accents, optimized for real-time CPU inference.
coqui-ai/TTS
A deep learning toolkit for advanced, multilingual text-to-speech generation and voice cloning, suitable for research and production.
netease-youdao/EmotiVoice
EmotiVoice is an open-source, multi-voice, and prompt-controlled text-to-speech engine supporting English and Chinese with emotional synthesis capabilities.
yl4579/StyleTTS2
StyleTTS 2 is a cutting-edge text-to-speech model achieving human-level speech synthesis through style diffusion and adversarial training with large speech language models.
TensorSpeech/TensorFlowTTS
TensorFlowTTS is a real-time, state-of-the-art speech synthesis library built on TensorFlow 2, supporting multiple languages and optimized for efficient deployment.
jaywalnut310/vits
VITS is an end-to-end text-to-speech model that generates highly natural-sounding audio with diverse rhythms, outperforming traditional two-stage TTS systems.
mozilla/TTS
A deep learning library for advanced, high-quality, and efficient text-to-speech (TTS) synthesis, supporting multiple languages and models.
OpenMOSS/MOSS-TTS-Nano
MOSS-TTS-Nano is an open-source, multilingual, tiny speech generation model optimized for real-time CPU inference and lightweight integration.