
MoonInTheRiver/DiffSinger

DiffSinger is the official PyTorch implementation of a singing voice synthesis (SVS) and text-to-speech (TTS) system that uses a shallow diffusion mechanism to generate audio.

Core Features

High-quality Singing Voice Synthesis (SVS)

Text-to-Speech (TTS) synthesis

Uses a Shallow Diffusion Mechanism

Official implementation of an AAAI 2022 research paper

Supports various SVS pipelines including MIDI and F0 inputs

Detailed Introduction

DiffSinger is the official PyTorch implementation of the AAAI 2022 paper 'DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism'. It provides a framework for both singing voice synthesis (SVS) and text-to-speech (TTS), using a shallow diffusion model to generate audio. The SVS pipelines accept inputs such as lyrics, MIDI, and F0, and the project integrates with vocoders such as HiFiGAN. It is aimed at researchers and developers working on AI audio generation.
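As a rough illustration of the shallow diffusion idea from the paper: the reverse (denoising) process does not have to start from pure Gaussian noise at the final step T. It can start from a noised copy of a coarse mel-spectrogram at a shallower step k < T. The sketch below shows only that noising step in plain Python. It is not taken from the DiffSinger codebase; the function names, the linear schedule and its endpoint values, the step k, and the four-value stand-in frame are all assumptions made for illustration.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.06):
    """Linearly spaced noise variances beta_1..beta_T (illustrative values)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def alpha_bar(betas, k):
    """Cumulative product of (1 - beta_t) for t = 1..k; equals 1.0 when k = 0."""
    prod = 1.0
    for beta in betas[:k]:
        prod *= 1.0 - beta
    return prod


def noise_to_step(x0, betas, k, rng):
    """Sample x_k from q(x_k | x_0) = N(sqrt(abar_k) * x_0, (1 - abar_k) * I)."""
    abar = alpha_bar(betas, k)
    scale = math.sqrt(abar)
    sigma = math.sqrt(1.0 - abar)
    return [scale * v + sigma * rng.gauss(0.0, 1.0) for v in x0]


rng = random.Random(0)                # fixed seed so the sketch is repeatable
betas = linear_betas(num_steps=100)   # T = 100 is an assumed total step count
coarse_mel = [0.2, -0.5, 1.0, 0.3]    # hypothetical stand-in for one coarse mel frame
k = 54                                # hypothetical shallow step, k < T
x_k = noise_to_step(coarse_mel, betas, k, rng)
```

In a full system, a trained denoiser would then run k reverse steps starting from `x_k`, rather than T steps starting from pure noise, which is where the reduction in sampling cost comes from.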