AI Model Implementation / Text-to-Speech System
7.9k 2026-05-05
Plachtaa/VALL-E-X
An open-source implementation of Microsoft's VALL-E X, enabling zero-shot multilingual text-to-speech synthesis and voice cloning with emotion control.
Core Features
Multilingual TTS (English, Chinese, Japanese)
Zero-shot Voice Cloning from short audio prompts
Speech Emotion Control
Zero-shot Cross-Lingual Speech
Quick Start
git clone https://github.com/Plachtaa/VALL-E-X.git && cd VALL-E-X && pip install -r requirements.txt
Detailed Introduction
VALL-E X is an open-source reproduction of Microsoft's zero-shot Text-to-Speech (TTS) model of the same name. It generates natural, expressive speech in English, Chinese, and Japanese, and clones a voice from a brief audio sample. Microsoft did not initially release code or pretrained models for VALL-E X; this project fills that gap and makes the model available for research and application development.
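As a rough illustration of generating speech after the Quick Start install, the sketch below wraps the repository's Python entry point in a small helper. It assumes the script runs from the root of the VALL-E-X checkout and that `utils.generation` exposes `SAMPLE_RATE`, `preload_models`, and `generate_audio(text, prompt=...)` as described in the project's README; check these names against your checkout, since they may change. The helpers `save_pcm16_wav` and `synthesize` are hypothetical names introduced here and are not part of the project.

```python
import struct
import wave


def save_pcm16_wav(path, samples, sample_rate):
    """Write mono float samples in [-1.0, 1.0] to a 16-bit PCM WAV file.

    Hypothetical helper using only the standard library; the project's
    README uses SciPy for this step instead.
    """
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip so the int16 conversion cannot overflow
        frames += struct.pack("<h", int(s * 32767))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)       # mono
        wav.setsampwidth(2)       # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))


def synthesize(text, out_path, prompt=None):
    """Generate speech for `text` and save it to `out_path`.

    ASSUMPTION: the import below resolves only when run from the root of a
    VALL-E-X checkout with its requirements installed. The names are taken
    from the project's README and should be verified against your version.
    """
    # Deferred import so this module can be loaded without the repository.
    from utils.generation import SAMPLE_RATE, generate_audio, preload_models

    preload_models()  # fetches model checkpoints on first use
    audio = generate_audio(text, prompt=prompt)  # prompt=None: no cloned voice
    save_pcm16_wav(out_path, audio, SAMPLE_RATE)
    return out_path
```

From the repository root, a call such as `synthesize("Hello from VALL-E X.", "hello.wav")` would then write the generated audio to `hello.wav`. The import is deferred into the function so that the WAV helper can be used and tested without the model installed.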