AI Text-to-Speech (TTS) Model
4.2k 2026-04-18

metavoiceio/metavoice-src

MetaVoice-1B is an open-source, 1.2B parameter foundational model for highly expressive, human-like text-to-speech synthesis and zero-shot voice cloning.

Core Features

Generates English speech with emotional rhythm and tone.
Supports zero-shot voice cloning for American and British voices from a 30-second reference audio clip.
Enables cross-lingual voice cloning through finetuning with minimal data.
Synthesizes text of arbitrary length.
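
The 30-second reference requirement for zero-shot cloning can be checked before a clip is handed to the model. The following is a minimal sketch that uses only the Python standard library and is not part of the MetaVoice codebase. The helper names are hypothetical, the clip is assumed to be an uncompressed PCM WAV file, and treating 30 seconds as a lower bound is an assumption.

```python
import wave

# Assumption: the "30s reference audio" in the feature list is treated as a minimum.
MIN_REFERENCE_SECONDS = 30.0


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds (hypothetical helper)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_long_enough(path: str, minimum: float = MIN_REFERENCE_SECONDS) -> bool:
    """Return True if the clip is at least `minimum` seconds long."""
    return wav_duration_seconds(path) >= minimum
```

Calling is_long_enough("reference.wav") on a candidate clip gives a quick pass or fail before any model is loaded. Other formats such as MP3 would need a different decoder.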

Quick Start

docker-compose up -d ui

Detailed Introduction

MetaVoice-1B is an open-source text-to-speech model with 1.2 billion parameters, trained on 100,000 hours of speech data. It generates expressive, natural-sounding English speech, supports zero-shot voice cloning from short audio samples, and can be adapted for cross-lingual voice cloning through finetuning. It is released under the Apache 2.0 license, which permits both commercial and non-commercial use.

© 2026 OSS Alternative. hotgithub.com - All rights reserved.