Tags: #tts

AI-powered Video Content Automation Tool

Docker

8.9k

linyqh/NarratoAI

Leveraging AI models for one-click video commentary and editing, enabling efficient content creation.

ai video-editing llm

Visual LLM Workflow Builder

ComfyUI

2.2k

heshengtao/comfyui_LLM_party

A ComfyUI-based framework for building comprehensive LLM agent workflows, integrating diverse AI models, tools, and social platforms.

llm agent comfyui rag

AI Text-to-Speech System

Python

20.8k

FunAudioLLM/CosyVoice

CosyVoice is a multilingual, large-language-model-based text-to-speech system offering state-of-the-art voice generation, cloning, and full-stack deployment capabilities.

text-to-speech tts llm

Speech AI Framework

generative-ai llm speech-ai

17.2k

NVIDIA-NeMo/NeMo

A scalable generative AI framework for researchers and developers focused on Large Language Models, Multimodal, and Speech AI (ASR, TTS).

Speech Synthesis Library

PyTorch

5.9k

snakers4/silero-models

A collection of pre-trained, end-to-end text-to-speech models designed for simplicity, speed, and natural-sounding speech across multiple languages.

Text-to-Speech (TTS) Model

tts speech-synthesis bert

8.7k

fishaudio/Bert-VITS2

An open-source text-to-speech model that combines the VITS2 backbone with multilingual BERT for high-quality, multi-language speech synthesis.

Generative AI Text-to-Speech Model

tts generative-ai dialogue

39.2k

2noise/ChatTTS

A generative speech model optimized for natural, expressive dialogue in LLM assistants, featuring fine-grained prosodic control.

AI-powered Text-to-Speech System

Docker

30.0k

fishaudio/fish-speech

A state-of-the-art open-source multilingual text-to-speech system offering exceptionally natural, realistic, and emotionally rich voice generation.

tts ai multilingual

CLI Tool & Python Library

text-to-speech tts python

10.7k

rany2/edge-tts

Access Microsoft Edge's online text-to-speech service from Python without needing Edge, Windows, or an API key.

Text-to-Speech (TTS) System

20.3k

index-tts/index-tts

An industrial-level, zero-shot text-to-speech system offering precise duration control and disentangled emotional expression for highly natural and controllable speech synthesis.

AI/ML Audio Processing Library

6.9k

Blaizzy/mlx-audio

An efficient audio processing library built on Apple's MLX framework, enabling fast text-to-speech, speech-to-text, and speech-to-speech capabilities on Apple Silicon devices.

mlx apple-silicon tts

Commercial Text-to-Speech Software

Desktop Application

Node.js

6.1k

LokerL/tts-vue

A cross-platform desktop application for Microsoft Edge text-to-speech synthesis, built with Electron and Vue.

tts electron vue

android tts text-to-speech

Android Application

Android

4.4k

jing332/tts-server-android

An advanced Android TTS application offering Microsoft TTS integration, custom HTTP requests, local engine support, dialogue recognition, and robust features like auto-retry and text replacement.

text-to-speech tts neural-network

Speech Synthesis System

10.9k

rhasspy/piper

A fast, local neural text-to-speech system for generating high-quality speech offline.

llm ai-sales live-streaming

AI-powered Live Streaming Sales Assistant Platform

Docker

3.7k

PeterH0323/Streamer-Sales

Streamer-Sales is an AI large language model designed for live-streaming sales. It generates compelling product descriptions and integrates digital human generation, RAG, TTS, ASR, and Agent capabilities.

Text-to-Speech Library

text-to-speech tts multi-lingual

7.4k

myshell-ai/MeloTTS

A high-quality, multi-lingual text-to-speech library supporting various languages and accents, optimized for real-time CPU inference.

AI Text-to-Speech Engine

Docker

8.5k

netease-youdao/EmotiVoice

EmotiVoice is an open-source, multi-voice, and prompt-controlled text-to-speech engine supporting English and Chinese with emotional synthesis capabilities.

AI/ML Model & Speech Synthesis Library

Python

6.2k

yl4579/StyleTTS2

StyleTTS 2 is a cutting-edge text-to-speech model achieving human-level speech synthesis through style diffusion and adversarial training with large speech language models.

tts voice-cloning deep-learning

Text-to-Speech (TTS) Foundational Model

Docker

4.2k

metavoiceio/metavoice-src

MetaVoice-1B is an open-source, 1.2B-parameter foundation model for highly expressive, human-like text-to-speech synthesis with advanced voice cloning capabilities.

tts voice-cloning multilingual

AI Model Implementation / Text-to-Speech System

Python

7.9k

Plachtaa/VALL-E-X

An open-source implementation of Microsoft's VALL-E X, enabling zero-shot multilingual text-to-speech synthesis and voice cloning with emotion control.

Replaces:

ElevenLabs Commercial Text-to-Speech (TTS) services

Speech Synthesis Library

7.8k

jaywalnut310/vits

VITS is an end-to-end text-to-speech model that generates highly natural-sounding audio with diverse rhythms, outperforming traditional two-stage TTS systems.

tts on-device multilingual

On-Device Multilingual Text-to-Speech System

ONNX Runtime

7.1k

supertone-inc/supertonic

Supertonic is a lightning-fast, on-device, multilingual text-to-speech system offering high-quality audio and privacy without cloud dependencies.

Replaces:

Cloud-based Text-to-Speech Services

text-to-speech tts multilingual

Text-to-Speech Model

ONNX

3.0k

OpenMOSS/MOSS-TTS-Nano

MOSS-TTS-Nano is an open-source, multilingual, tiny speech generation model optimized for real-time CPU inference and lightweight integration.