Tags: #multilingual
FunAudioLLM/CosyVoice
CosyVoice is an advanced multilingual text-to-speech system based on large language models, offering state-of-the-art voice generation, cloning, and full-stack deployment capabilities.
YouMind-OpenLab/awesome-nano-banana-pro-prompts
A vast, multilingual, and open-source library of over 10,000 curated prompts for Google's Nano Banana Pro AI image generation, complete with preview images.
fishaudio/Bert-VITS2
An open-source text-to-speech model that combines the VITS2 backbone with multilingual BERT for high-quality, multi-language speech synthesis.
fishaudio/fish-speech
A state-of-the-art open-source multilingual text-to-speech system offering exceptionally natural, realistic, and emotionally rich voice generation.
WhisperSpeech/WhisperSpeech
An open-source, high-performance text-to-speech (TTS) system built by inverting OpenAI Whisper, aiming to be the Stable Diffusion for speech.
netease-youdao/EmotiVoice
EmotiVoice is an open-source, multi-voice, prompt-controlled text-to-speech engine that supports English and Chinese and offers emotional synthesis.
Plachtaa/VALL-E-X
An open-source implementation of Microsoft's VALL-E X, enabling zero-shot multilingual text-to-speech synthesis and voice cloning with emotion control.
supertone-inc/supertonic
Supertonic is a lightning-fast, on-device, multilingual text-to-speech system offering high-quality audio and privacy without cloud dependencies.
OpenMOSS/MOSS-TTS-Nano
MOSS-TTS-Nano is an open-source, multilingual, tiny speech generation model optimized for real-time CPU inference and lightweight integration.