NVIDIA-NeMo/NeMo
A scalable generative AI framework for researchers and developers focused on Large Language Models, Multimodal, and Speech AI (ASR, TTS).
Core Features
Quick Start
pip install nemo-toolkit
Detailed Introduction
NVIDIA NeMo Speech is a scalable generative AI framework for researchers and PyTorch developers working on speech AI: Automatic Speech Recognition (ASR), Text-to-Speech (TTS), and Speech Large Language Models (LLMs). Users build, customize, and deploy models starting from the framework's existing code and pre-trained checkpoints. The framework emphasizes performance, with features such as low-latency streaming inference and multilingual support, which makes it suited to conversational AI and speech processing applications.
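As a rough illustration of starting from a pre-trained checkpoint, the sketch below loads an ASR model by name and transcribes a list of audio files. The helper names nemo_available and transcribe_files are hypothetical and exist only for this example. It assumes the ASR collection is installed, which may require installing the package with its ASR extras and not only the base toolkit, and that the checkpoint name passed in exists in the model catalog.

```python
import importlib.util


def nemo_available() -> bool:
    """Return True if the top-level NeMo package is importable here."""
    # find_spec only locates the package; it does not import anything heavy.
    return importlib.util.find_spec("nemo") is not None


def transcribe_files(audio_paths, model_name):
    """Load a pre-trained ASR checkpoint by name and transcribe audio files.

    Hypothetical helper for illustration; not part of NeMo itself.
    """
    # Imported lazily so this module loads even when NeMo is not installed.
    import nemo.collections.asr as nemo_asr

    # from_pretrained fetches the named checkpoint (downloading it if needed).
    model = nemo_asr.models.ASRModel.from_pretrained(model_name=model_name)

    # transcribe takes a list of audio file paths. The element type of the
    # result (plain strings or hypothesis objects) depends on the NeMo version.
    return model.transcribe(audio_paths)
```

A call might look like transcribe_files(["sample.wav"], "stt_en_conformer_ctc_small"), where the file path is a placeholder and the checkpoint name should be checked against the current model catalog before use. Guarding the call with nemo_available() lets a script fail with a clear message when the toolkit is missing.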