Conversational AI Framework
10.4k 2026-04-12
TEN-framework/ten-framework
An open-source framework for building real-time multimodal conversational AI agents.
Core Features
Low-latency, high-quality real-time voice assistants with RTC/WebSocket support
Real-time speaker diarization that detects and labels who is speaking
Lip-sync avatar integration with multiple vendors (e.g., MotionSync, Trulience)
SIP call integration enabling phone calls powered by TEN agents
Transcription tool for converting audio to text
Detailed Introduction
TEN is an open-source framework for building real-time multimodal conversational AI agents. It targets voice AI applications and supports low-latency voice assistants, speaker diarization, and lip-sync avatar integration. The framework is extensible and lets developers build agents across multiple platforms, including hardware integrations such as the ESP32-S3.