Tags: #voice-cloning

AI Text-to-Speech System

20.8k

FunAudioLLM/CosyVoice

CosyVoice is a multilingual text-to-speech system based on a large language model, offering state-of-the-art voice generation, cloning, and full-stack deployment capabilities.

text-to-speech tts llm

text-to-speech audiobook-generator voice-cloning

E-book to Audiobook Converter

Docker

18.8k

DrewThomasson/ebook2audiobook

Generate audiobooks from e-books with advanced text-to-speech, voice cloning, and extensive language support.

AI Speech Synthesis System

tokenizer-free-tts multilingual-speech voice-cloning

16.3k

OpenBMB/VoxCPM

VoxCPM2 is a tokenizer-free, 2B-parameter Text-to-Speech system supporting 30 languages, creative voice design, and controllable voice cloning with 48kHz studio-quality audio output.

Replaces:

Commercial Text-to-Speech Services, Voice Cloning Platforms

Deep Learning Framework

voice-cloning speech-synthesis deep-learning

59.7k

CorentinJ/Real-Time-Voice-Cloning

A deep learning framework for real-time voice cloning and text-to-speech synthesis from short audio samples.

AI Voice Cloning Toolkit

voice-cloning text-to-speech real-time

36.9k

babysor/MockingBird

A real-time voice cloning toolkit that allows users to replicate a voice in 5 seconds and generate arbitrary speech.

Replaces:

ElevenLabs, Google Cloud Text-to-Speech...

AI Voice Synthesis WebUI

text-to-speech voice-cloning few-shot-learning

57.1k

RVC-Boss/GPT-SoVITS

A powerful web-based tool for few-shot voice cloning and text-to-speech, enabling high-quality voice generation from minimal audio data.

Replaces:

Commercial Text-to-Speech Services, Voice Cloning Software

text-to-speech ai open-source

AI Speech Synthesis System

4.6k

WhisperSpeech/WhisperSpeech

An open-source, high-performance text-to-speech (TTS) system built by inverting OpenAI Whisper, aiming to be the Stable Diffusion for speech.

speech synthesis, sound generation, text-to-speech

AI Speech and Sound Generation Framework

llama.cpp

1.5k

OpenMOSS/MOSS-TTS

An open-source AI model family for high-fidelity, expressive speech and sound generation across diverse real-world applications.

AI-powered Text-to-Speech System

text-to-speech speech-synthesis ai

6.1k

canopyai/Orpheus-TTS

Orpheus TTS is a state-of-the-art open-source text-to-speech system built on a Llama-3b backbone, aiming to generate human-sounding, emotionally rich speech with low latency.

ai voice, speech recognition, text-to-speech

AI Multimedia Processing Web Application

Gradio

7.2k

abus-aikorea/voice-pro

An AI-powered web application for comprehensive multimedia content creation, offering speech recognition, voice cloning, text-to-speech, and multilingual translation.

Replaces:

ElevenLabs

AI Voice Cloning Tool with Web UI

voice cloning, text-to-speech, speech-to-speech

8.9k

jianchang512/clone-voice

A user-friendly, open-source tool that clones any human voice to generate speech from text or convert existing audio, featuring a web interface and multi-language support.

Replaces:

ElevenLabs, Descript Overdub

voice cloning, text-to-speech, ai

AI Voice Cloning Framework

python

36.5k

myshell-ai/OpenVoice

An open-source AI model for instant, accurate, and flexible voice cloning, supporting cross-lingual synthesis and granular style control.

Deep Learning Library

deep-learning text-to-speech voice-synthesis

45.2k

coqui-ai/TTS

A deep learning toolkit for advanced, multi-language Text-to-Speech generation and voice cloning, suitable for research and production.

tts, voice cloning, deep learning

Text-to-Speech (TTS) Foundational Model

Docker

4.2k

metavoiceio/metavoice-src

MetaVoice-1B is an open-source, 1.2B-parameter foundation model for highly expressive, human-like text-to-speech synthesis with advanced voice cloning capabilities.

AI Model Implementation / Text-to-Speech System

tts voice-cloning multilingual

7.9k

Plachtaa/VALL-E-X

An open-source implementation of Microsoft's VALL-E X, enabling zero-shot multilingual text-to-speech synthesis and voice cloning with emotion control.

Replaces:

ElevenLabs, Commercial Text-to-Speech (TTS) services