AI Voice Studio Application
24.7k 2026-05-06
jamiepine/voicebox
An open-source, local-first AI voice studio offering voice cloning, speech generation, and dictation with complete privacy.
Core Features
Complete privacy: all models run locally and all data stays on your machine.
Zero-shot voice cloning and over 50 preset voices across 7 TTS engines.
Support for 23 languages and advanced post-processing effects.
Global dictation hotkey, in-app mic, and Whisper-based STT for voice input.
API-first design for integration with custom apps and AI agents.
Detailed Introduction
Voicebox is an open-source AI voice studio that runs entirely on your local machine, so models and voice data stay private. It is a local alternative to cloud services such as ElevenLabs and WisprFlow, combining voice output (cloning and speech generation) with voice input (dictation and speech-to-text). It supports 7 TTS engines and 23 languages, and adds post-processing effects and agent integration, so users can generate speech, dictate, and interact with AI using custom voices in a single native application.