Real-time Speech-to-Text Application
4.0k 2026-04-18
collabora/WhisperLive
A real-time transcription application leveraging OpenAI's Whisper model for converting live or pre-recorded speech into text with optimized backends.
Core Features
Near-real-time speech-to-text transcription.
Supports both live microphone input and pre-recorded audio files.
Multiple inference backends: Faster Whisper, NVIDIA TensorRT, and OpenVINO.
Compatible with the OpenAI REST interface for transcription.
Docker support for easy server deployment.
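As a rough illustration of the OpenAI-compatible REST interface listed above, the sketch below builds a multipart upload shaped like OpenAI's public transcription request (the `/v1/audio/transcriptions` path with `file` and `model` form fields). The host, port 8000, the `whisper-1` model name, and the helper names `make_silent_wav` and `build_transcription_request` are assumptions made for this example, not details taken from WhisperLive. Whether a given WhisperLive server accepts exactly this request should be checked against the project's own documentation.

```python
import io
import urllib.request
import uuid
import wave


def make_silent_wav(seconds: float = 0.1, rate: int = 16000) -> bytes:
    """Return a tiny mono 16-bit WAV of silence, used here as placeholder audio."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * int(seconds * rate))
    return buf.getvalue()


def build_transcription_request(base_url, audio, filename="audio.wav", model="whisper-1"):
    """Build (but do not send) an OpenAI-style multipart transcription request.

    Hypothetical helper: base_url and model are placeholders, not values
    confirmed for WhisperLive.
    """
    boundary = uuid.uuid4().hex
    model_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
    ).encode()
    file_part = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: audio/wav\r\n\r\n"
    ).encode() + audio + b"\r\n"
    closing = f"--{boundary}--\r\n".encode()
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/audio/transcriptions",
        data=model_part + file_part + closing,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


if __name__ == "__main__":
    # Assumed address of a locally running, REST-enabled server.
    req = build_transcription_request("http://localhost:8000", make_silent_wav())
    with urllib.request.urlopen(req) as resp:
        print(resp.read().decode())
```

Building the request separately from sending it keeps the example usable without a running server; the `__main__` block only succeeds when a compatible endpoint is actually listening at the assumed address.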
Quick Start
pip install whisper-live
Detailed Introduction
WhisperLive is an open-source project that provides a near-real-time implementation of OpenAI's Whisper model. It transcribes spoken language into text, either from live audio captured by a microphone or from existing audio files. The project focuses on performance and flexibility: it offers several optimized inference backends (Faster Whisper, NVIDIA TensorRT, and OpenVINO) to suit different hardware and performance requirements, with the goal of making high-quality, low-latency transcription accessible to a range of applications.