On-device AI Inference SDK
8.0k 2026-05-01
qualcomm/nexa-sdk
A high-performance SDK that enables day-0 local inference of frontier LLMs and VLMs across diverse hardware (NPU, GPU, CPU) and platforms (PC, mobile, IoT) with minimal energy consumption.
Core Features
Day-0 support for cutting-edge LLMs and VLMs.
Optimized local inference on NPU, GPU, and CPU.
Broad cross-platform compatibility (Windows, Linux, Android, iOS, IoT).
High performance and energy efficiency for on-device AI.
SDK coverage spanning CLI, Python, Android, and Docker.
Detailed Introduction
NexaSDK is a high-performance local inference framework designed to run capable, fast AI models directly on devices with minimal energy consumption. It provides day-0 support for the latest multimodal AI models, often weeks or months ahead of competitors, and optimizes their execution across NPUs, GPUs, and CPUs. With runtime coverage for PC, mobile, and IoT platforms, NexaSDK lets developers build on-device AI applications that run across a wide range of hardware.