OSS Alternative - Discover Top Open Source Alternatives to Popular Software

RVC-Boss/GPT-SoVITS

A web-based tool for few-shot voice cloning and text-to-speech that generates high-quality voices from as little as 5 seconds of reference audio.

Core Features

Zero-shot text-to-speech from a 5-second audio sample.

Few-shot text-to-speech, fine-tuned on just 1 minute of voice data.

Cross-lingual support for English, Japanese, Korean, Cantonese, and Chinese.

Integrated WebUI tools for dataset creation, including vocal/accompaniment separation and automatic speech recognition (ASR).

Quick Start

conda create -n GPTSoVits python=3.10 && conda activate GPTSoVits && pwsh -F install.ps1 --Device CU128 --Source HF

Detailed Introduction

GPT-SoVITS-WebUI is an open-source project that provides few-shot voice conversion and text-to-speech through a web interface. It needs very little audio: about one minute of training data is enough for fine-tuning, and zero-shot inference works from a 5-second vocal sample. The WebUI also bundles tools for preparing training datasets, such as separating vocals from accompaniment and automatic segmentation, so both beginners and experienced users can build custom AI voices. Supported languages are English, Japanese, Korean, Cantonese, and Chinese.
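As a rough illustration of the two audio thresholds mentioned above (5 seconds for zero-shot, 1 minute for few-shot fine-tuning), the sketch below measures a WAV file with Python's standard wave module and reports which mode the recording is long enough for. The helper names wav_duration_seconds and suggest_mode are hypothetical and are not part of GPT-SoVITS; treating the two figures as hard cutoffs is also an assumption made only for this example.

```python
import wave

# Thresholds taken from the description above; using them as hard
# cutoffs is an assumption of this sketch, not GPT-SoVITS behavior.
ZERO_SHOT_MIN_SECONDS = 5.0   # zero-shot TTS from a 5-second sample
FEW_SHOT_MIN_SECONDS = 60.0   # fine-tuning with 1 minute of voice data


def wav_duration_seconds(path):
    """Return the duration of a WAV file in seconds (hypothetical helper)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def suggest_mode(total_seconds):
    """Map an audio duration to the mode it is long enough for (hypothetical helper)."""
    if total_seconds >= FEW_SHOT_MIN_SECONDS:
        return "few-shot"
    if total_seconds >= ZERO_SHOT_MIN_SECONDS:
        return "zero-shot"
    return "too-short"
```

A call such as suggest_mode(wav_duration_seconds("my_voice.wav")) would then give a quick sanity check before opening the WebUI; the file name here is only a placeholder.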