AI Multimedia Processing Web Application
7.2k 2026-05-01
abus-aikorea/voice-pro
An AI-powered web application for comprehensive multimedia content creation, offering speech recognition, voice cloning, text-to-speech, and multilingual translation.
Core Features
Speech recognition (Whisper, Faster-Whisper, WhisperX)
Zero-shot voice cloning (F5-TTS, E2-TTS, CosyVoice)
Multilingual text-to-speech (Edge-TTS, kokoro)
YouTube video processing and audio extraction
Instant translation for over 100 languages
Quick Start
configure.bat && start.bat
Detailed Introduction
Voice-Pro is an AI-powered web application for multimedia content creation. It combines YouTube video downloading, vocal isolation, speech recognition, multilingual translation, and text-to-speech generation, including zero-shot voice cloning. Positioned as an alternative to commercial services such as ElevenLabs, Voice-Pro is aimed at creators, researchers, and multilingual professionals who need to transcribe, translate, and re-voice audio and video content.