AI Optimization Library
1.0k 2026-04-18

intel/auto-round

AutoRound is an advanced quantization toolkit for Large Language Models (LLMs) and Vision-Language Models (VLMs), enabling high-accuracy, ultra-low-bit inference across diverse hardware.

Core Features

High accuracy retained at ultra-low bit widths (2-4 bits)
Broad hardware compatibility (CPU, XPU, CUDA)
Support for multiple data types (e.g., MXFP4, NVFP4, FP8_BLOCK, INT2)
Seamless integration with popular AI frameworks (vLLM, SGLang, Transformers)
Advanced algorithms: SignRoundV1/V2 for weight rounding and AutoScheme for mixed-precision quantization
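
To make the bit-width trade-off in the list above concrete, here is a minimal sketch of plain symmetric round-to-nearest quantization in pure Python. It is not SignRound and does not use the AutoRound API. The function names and the sample weights are hypothetical and chosen only for illustration. It shows how reconstruction error grows as the bit width shrinks, which is the error that more advanced rounding algorithms aim to reduce.

```python
def quantize_rtn(weights, bits):
    """Symmetric round-to-nearest quantization of a list of floats.

    Illustrative baseline only; this is NOT the SignRound algorithm.
    """
    qmax = 2 ** (bits - 1) - 1  # largest positive code, e.g. 7 for 4 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the signed range.
    codes = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate float weights."""
    return [c * scale for c in codes]


def mean_abs_error(a, b):
    """Average absolute difference between two equal-length lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Hypothetical weights, made up for this example.
weights = [0.82, -0.41, 0.07, -0.93, 0.28, 0.55, -0.12, 0.64]

errors = {}
for bits in (8, 4, 2):
    codes, scale = quantize_rtn(weights, bits)
    errors[bits] = mean_abs_error(weights, dequantize(codes, scale))
    print(f"{bits}-bit: codes={codes} mean abs error={errors[bits]:.4f}")
```

With round-to-nearest, every weight is rounded independently of the others. Assuming the usual motivation for learned rounding, an algorithm that tunes the rounding decisions jointly has the most room to help at 2-4 bits, where each rounding step is large.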

Detailed Introduction

AutoRound is an open-source library developed by Intel for quantizing Large Language Models (LLMs) and Vision-Language Models (VLMs) before deployment. It uses the SignRoundV1 and V2 quantization algorithms to preserve accuracy at very low bit widths (2-4 bits). Storing weights in fewer bits reduces memory footprint and computational cost, which makes inference more efficient and accessible on hardware platforms such as CPUs, XPUs, and GPUs. Integration with vLLM, SGLang, and Hugging Face Transformers lets quantized models be used in existing serving and inference pipelines.
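
The memory reduction mentioned above can be estimated with simple arithmetic. The sketch below is a back-of-the-envelope calculation, not a measurement of AutoRound output. The 7-billion-parameter model size, the group size of 128, and the 16-bit per-group scale are all assumptions chosen for illustration, and `weight_bytes` is a hypothetical helper. Real checkpoints may also store zero points, unquantized layers, and other metadata.

```python
def weight_bytes(num_params, bits, group_size=None, scale_bits=16):
    """Approximate weight storage in bytes.

    Counts `bits` per parameter, plus one `scale_bits`-wide scale per
    group of `group_size` parameters when group_size is given.
    """
    total_bits = num_params * bits
    if group_size is not None:
        num_groups = -(-num_params // group_size)  # ceiling division
        total_bits += num_groups * scale_bits
    return total_bits / 8


NUM_PARAMS = 7_000_000_000  # hypothetical 7B-parameter model

baseline = weight_bytes(NUM_PARAMS, bits=16)  # 16-bit weights, no scales
print(f"16-bit: {baseline / 1e9:.2f} GB")

for bits in (4, 2):
    size = weight_bytes(NUM_PARAMS, bits=bits, group_size=128)
    ratio = baseline / size
    print(f"{bits}-bit: {size / 1e9:.2f} GB ({ratio:.1f}x smaller than 16-bit)")
```

Under these assumptions the per-group scales add a fixed overhead of 16/128 = 0.125 bits per weight, so the saving at 2 bits is somewhat less than the nominal 8x.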
