Tags: #text-to-speech

AI Text-to-Speech System

20.8k

FunAudioLLM/CosyVoice

CosyVoice is an advanced multi-lingual large language model-based text-to-speech system offering state-of-the-art voice generation, cloning, and full-stack deployment capabilities.

text-to-speech tts llm

voice-ai speech-to-text text-to-speech

Voice AI Toolkit

7.9k

moonshine-ai/moonshine

An open-source, on-device AI toolkit for real-time, low-latency speech-to-text, intent recognition, and text-to-speech across multiple platforms.

text-to-speech audiobook-generator voice-cloning

E-book to Audiobook Converter

Docker

18.8k

DrewThomasson/ebook2audiobook

Generate audiobooks from e-books with advanced text-to-speech, voice cloning, and extensive language support.

Speech Synthesis Library

pytorch

5.9k

snakers4/silero-models

A collection of pre-trained, end-to-end text-to-speech models designed for simplicity, speed, and natural-sounding speech across multiple languages.

speech recognition text-to-speech speech translation

Speech AI Toolkit

PaddlePaddle

12.6k

PaddlePaddle/PaddleSpeech

An easy-to-use open-source toolkit built on PaddlePaddle, offering state-of-the-art models for diverse speech and audio tasks like ASR, TTS, translation, and speaker verification.

AI/ML Audio Platform

text-to-speech audio-generation voice-conversion

3.1k

rsxdalv/TTS-WebUI

A unified Gradio and React web interface integrating a vast collection of open-source Text-to-Speech, audio generation, and voice conversion AI models.

Replaces:

ElevenLabs, Google Cloud Text-to-Speech...

CLI Tool & Python Library

text-to-speech tts python

10.7k

rany2/edge-tts

Access Microsoft Edge's online text-to-speech service from Python without needing Edge, Windows, or an API key.

Text-to-Speech (TTS) System

20.3k

index-tts/index-tts

An industrial-level, zero-shot text-to-speech system offering precise duration control and disentangled emotional expression for highly natural and controllable speech synthesis.

Deep Learning Framework

voice-cloning speech-synthesis deep-learning

59.7k

CorentinJ/Real-Time-Voice-Cloning

A deep learning framework for real-time voice cloning and text-to-speech synthesis from short audio samples.

Content Creation Tool

text-to-speech audiobook-generator epub-pdf-converter

4.3k

denizsafak/abogen

Generate high-quality audiobooks and voiceovers from various text formats with synchronized captions.

AI Voice Cloning Toolkit

voice-cloning text-to-speech real-time

36.9k

babysor/MockingBird

A real-time voice cloning toolkit that allows users to replicate a voice in 5 seconds and generate arbitrary speech.

Replaces:

ElevenLabs, Google Cloud Text-to-Speech...

Audiobook Generation Tool

audiobook-generation text-to-speech epub-converter

6.4k

santinic/audiblez

A Python-based tool to convert e-books (EPUB) into high-quality M4B audiobooks using advanced text-to-speech models.

Replaces:

Audible, Commercial Audiobook Services

AI Voice Synthesis WebUI

text-to-speech voice-cloning few-shot-learning

57.1k

RVC-Boss/GPT-SoVITS

A powerful web-based tool for few-shot voice cloning and text-to-speech, enabling high-quality voice generation from minimal audio data.

Replaces:

Commercial Text-to-Speech Services, Voice Cloning Software

fastapi text-to-speech docker

Text-to-Speech API Server

Docker

4.8k

remsky/Kokoro-FastAPI

A Dockerized FastAPI wrapper for the Kokoro-82M text-to-speech model, offering multi-language support, CPU/GPU inference, and an OpenAI-compatible API.

Replaces:

OpenAI Speech API

text-to-speech ai open-source

AI Speech Synthesis System

4.6k

WhisperSpeech/WhisperSpeech

An open-source, high-performance text-to-speech (TTS) system built by inverting OpenAI Whisper, aiming to be the Stable Diffusion for speech.

speech synthesis sound generation text-to-speech

AI Speech and Sound Generation Framework

llama.cpp

1.5k

OpenMOSS/MOSS-TTS

An open-source AI model family for high-fidelity, expressive speech and sound generation across diverse real-world applications.

AI/ML Library & SDK

text-to-speech ai-inference python-library

1.4k

edwko/OuteTTS

A versatile interface for OuteTTS models, providing flexible text-to-speech generation capabilities across various AI inference backends and hardware platforms.

chatgpt openai-api speech-to-text

AI Chat Client

Git

1.6k

yakGPT/yakGPT

A locally running, hands-free ChatGPT UI that enhances text generation and chat engagement with speech-to-text and text-to-speech capabilities.

Replaces:

ChatGPT

AI-powered Text-to-Speech System

text-to-speech speech-synthesis ai

6.1k

canopyai/Orpheus-TTS

Orpheus TTS is a state-of-the-art open-source text-to-speech system built on a Llama-3b backbone, aiming to generate human-sounding, emotionally rich speech with low latency.

ai voice speech recognition text-to-speech

AI Multimedia Processing Web Application

Gradio

7.2k

abus-aikorea/voice-pro

An AI-powered web application for comprehensive multimedia content creation, offering speech recognition, voice cloning, text-to-speech, and multilingual translation.

Replaces:

ElevenLabs

Text-to-Speech Web Interface

chattts text-to-speech web-ui

7.5k

jianchang512/ChatTTS-ui

Provides a local web interface and API for ChatTTS to synthesize text into speech, supporting mixed Chinese, English, and numbers.

AI Voice Cloning Tool with Web UI

voice cloning text-to-speech speech-to-speech

8.9k

jianchang512/clone-voice

A user-friendly, open-source tool that clones any human voice to generate speech from text or convert existing audio, featuring a web interface and multi-language support.

Replaces:

ElevenLabs, Descript Overdub

android tts text-to-speech

Android Application

Android

4.4k

jing332/tts-server-android

An advanced Android TTS application offering Microsoft TTS integration, custom HTTP requests, local engine support, dialogue recognition, and robust features like auto-retry and text replacement.

text-to-speech tts neural-network

Speech Synthesis System

10.9k

rhasspy/piper

A fast, local neural text-to-speech system for generating high-quality speech offline.

AI Voice Cloning Framework

voice cloning text-to-speech ai

36.5k

myshell-ai/OpenVoice

An open-source AI model for instant, accurate, and flexible voice cloning, supporting cross-lingual synthesis and granular style control.

Audio Synthesis Framework

singing-voice-synthesis text-to-speech diffusion-models

4.8k

MoonInTheRiver/DiffSinger

DiffSinger is the official PyTorch implementation of a singing voice synthesis (SVS) and text-to-speech (TTS) system that uses a shallow diffusion mechanism for high-quality audio generation.

Text-to-Speech Library

text-to-speech tts multi-lingual

7.4k

myshell-ai/MeloTTS

A high-quality, multi-lingual text-to-speech library supporting various languages and accents, optimized for real-time CPU inference.

Deep Learning Library

deep-learning text-to-speech voice-synthesis

45.2k

coqui-ai/TTS

A deep learning toolkit for advanced, multi-language Text-to-Speech generation and voice cloning, suitable for research and production.

AI Text-to-Speech Engine

Docker

8.5k

netease-youdao/EmotiVoice

EmotiVoice is an open-source, multi-voice, and prompt-controlled text-to-speech engine supporting English and Chinese with emotional synthesis capabilities.

AI/ML Model & Speech Synthesis Library

6.2k

yl4579/StyleTTS2

StyleTTS 2 is a cutting-edge text-to-speech model achieving human-level speech synthesis through style diffusion and adversarial training with large speech language models.

speech-synthesis text-to-speech tensorflow

Speech Synthesis Library

TensorFlow 2

4.0k

TensorSpeech/TensorFlowTTS

TensorFlowTTS is a real-time, state-of-the-art speech synthesis library built on TensorFlow 2, supporting multiple languages and optimized for efficient deployment.

Replaces:

Commercial Text-to-Speech APIs

Speech Synthesis Library

7.8k

jaywalnut310/vits

VITS is an end-to-end text-to-speech model that generates highly natural-sounding audio with diverse rhythms, outperforming traditional two-stage TTS systems.

text-to-speech deep-learning speech-synthesis

Deep Learning Library

pytorch

10.1k

mozilla/TTS

A deep learning library for advanced, high-quality, and efficient Text-to-Speech (TTS) synthesis, supporting multiple languages and models.

text-to-speech tts multilingual

Text-to-Speech Model

onnx

3.0k

OpenMOSS/MOSS-TTS-Nano

MOSS-TTS-Nano is an open-source, multilingual, tiny speech generation model optimized for real-time CPU inference and lightweight integration.