Speech AI Toolkit
12.6k 2026-05-01
PaddlePaddle/PaddleSpeech
An easy-to-use open-source toolkit built on PaddlePaddle, offering state-of-the-art models for speech and audio tasks such as automatic speech recognition (ASR), text-to-speech (TTS), speech translation, and speaker verification.
Core Features
Automatic Speech Recognition (ASR), including state-of-the-art and streaming models
Streaming Text-to-Speech (TTS) with text frontend
End-to-End Speech Translation
Speaker Verification System
Keyword Spotting and Self-Supervised Learning models
Detailed Introduction
PaddleSpeech is an open-source toolkit built on the PaddlePaddle platform for speech and audio tasks. It provides state-of-the-art and influential models for speech recognition, text-to-speech synthesis, speech translation, speaker verification, and keyword spotting. The project received the NAACL 2022 Best Demo Award. It aims to give researchers and developers an easy-to-use toolkit for building and deploying speech applications quickly.
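As a rough illustration of the "easy-to-use" claim, the sketch below builds and optionally runs the `paddlespeech` command-line tool from Python. It assumes the package is installed and that the `asr` and `tts` subcommands accept `--lang`, `--input`, and `--output` as they do in the project's README; check these flags against your installed version. The helper names (`asr_command`, `tts_command`, `run_if_installed`) and the file names are hypothetical and exist only for this example.

```python
import shutil
import subprocess


def asr_command(audio_path, lang="zh"):
    """Build the argv for speech recognition on one audio file.

    Assumption: `paddlespeech asr --lang <lang> --input <wav>` is the CLI form.
    """
    return ["paddlespeech", "asr", "--lang", lang, "--input", audio_path]


def tts_command(text, output_path="output.wav"):
    """Build the argv for synthesizing `text` into a WAV file.

    Assumption: `paddlespeech tts --input <text> --output <wav>` is the CLI form.
    """
    return ["paddlespeech", "tts", "--input", text, "--output", output_path]


def run_if_installed(argv):
    """Run argv if its executable is on PATH and return stdout, else None."""
    if shutil.which(argv[0]) is None:
        return None  # CLI not installed; nothing is executed
    done = subprocess.run(argv, capture_output=True, text=True)
    return done.stdout
```

Calling `run_if_installed(asr_command("zh.wav"))` would print the transcript only if the CLI is available and `zh.wav` exists; on a machine without PaddleSpeech it returns `None` without running anything. The first invocation of the real tool typically downloads a pretrained model, so it may take some time.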