intel/auto-round
AutoRound is an advanced quantization toolkit for Large Language Models (LLMs) and Vision-Language Models (VLMs), enabling high-accuracy, ultra-low-bit inference across diverse hardware.
Detailed Introduction
AutoRound is an open-source library developed by Intel for quantizing Large Language Models (LLMs) and Vision-Language Models (VLMs) ahead of deployment. It uses the SignRound algorithms (SignRoundV1 and SignRoundV2) to preserve accuracy at very low bit widths (2-4 bits). Lower bit widths reduce memory footprint and computational cost, which makes inference more efficient and accessible on hardware platforms such as CPUs, XPUs, and GPUs. Integration with vLLM, SGLang, and Hugging Face Transformers simplifies using the quantized models in real-world applications.
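To make the memory claim concrete, here is a rough back-of-the-envelope sketch in plain Python. It does not call AutoRound. The function names, the group size of 128, the 16-bit scale, and the per-group zero point are all illustrative assumptions about a typical weight-only, group-wise layout, not values taken from the library.

```python
def fp16_weight_bytes(num_params):
    """Bytes needed to store weights in 16-bit floating point."""
    return num_params * 2


def quantized_weight_bytes(num_params, bits, group_size=128,
                           scale_bits=16, zero_point_bits=None):
    """Estimate bytes for weight-only, group-wise quantized weights.

    Assumption: each group of `group_size` weights shares one scale and
    one zero point, so their cost is spread across the group.
    """
    if zero_point_bits is None:
        zero_point_bits = bits  # assume the zero point uses the weight bit width
    overhead_bits = (scale_bits + zero_point_bits) / group_size
    return num_params * (bits + overhead_bits) / 8


if __name__ == "__main__":
    num_params = 7_000_000_000  # hypothetical 7B-parameter model
    baseline = fp16_weight_bytes(num_params)
    for bits in (4, 3, 2):
        size = quantized_weight_bytes(num_params, bits)
        print(f"{bits}-bit: {size / 2**30:.2f} GiB, "
              f"{baseline / size:.2f}x smaller than FP16")
```

The estimate covers weights only. Activations, the KV cache, and any layers kept at higher precision add to the real footprint, so treat the ratio as an upper bound on the savings.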