Tags: #quantization
xlite-dev/Awesome-LLM-Inference
A comprehensive, curated list of research papers and associated code for optimizing Large Language Model (LLM) and Vision Language Model (VLM) inference.
bitsandbytes-foundation/bitsandbytes
A PyTorch library enabling accessible large language models by dramatically reducing memory consumption through k-bit quantization for both inference and training.
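A minimal sketch of the most common usage, via the Hugging Face Transformers integration: load a causal LM in 4-bit NF4 with bfloat16 compute. The model id is just an example; any causal LM works.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 quantization; matmuls run in bfloat16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,  # also quantize the quantization constants
)

model_id = "meta-llama/Llama-2-7b-hf"  # example model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```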
stochasticai/xTuring
xTuring simplifies the fine-tuning, evaluation, and deployment of open-source Large Language Models (LLMs) on private data, with memory-efficient options such as LoRA and INT4 fine-tuning.
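A minimal sketch of the workflow, assuming the `BaseModel.create` / `finetune` API and the `llama_lora_int4` model key from the project's README; the dataset path is hypothetical.

```python
from xturing.datasets import InstructionDataset
from xturing.models import BaseModel

# Load an instruction dataset from a local directory (hypothetical path).
dataset = InstructionDataset("./my_private_data")

# "llama_lora_int4" combines LoRA adapters with INT4 quantization
# for memory-efficient fine-tuning on a single GPU.
model = BaseModel.create("llama_lora_int4")
model.finetune(dataset=dataset)

output = model.generate(texts=["What is quantization?"])
print(output)
```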
ModelCloud/GPTQModel
A toolkit for quantizing (compressing) Large Language Models (LLMs) with hardware acceleration across various GPUs and CPUs, integrating with popular inference frameworks.
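A minimal quantization sketch, assuming GPTQModel's `load` / `quantize` / `save` API; the model id and calibration texts are placeholders.

```python
from gptqmodel import GPTQModel, QuantizeConfig

model_id = "meta-llama/Llama-3.2-1B"  # example model
quant_path = "Llama-3.2-1B-gptq-4bit"

# 4-bit GPTQ with group size 128 is the most common setting.
quant_config = QuantizeConfig(bits=4, group_size=128)

model = GPTQModel.load(model_id, quant_config)

# A real run needs a few hundred calibration samples; two shown for brevity.
calibration = [
    "GPTQ minimizes quantization error layer by layer.",
    "Calibration data should resemble the deployment distribution.",
]
model.quantize(calibration, batch_size=1)
model.save(quant_path)
```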
nunchaku-ai/nunchaku
Nunchaku is a high-performance inference engine for 4-bit neural networks, built on SVDQuant and aimed primarily at diffusion models.
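A sketch of plugging Nunchaku into a diffusers pipeline. The class name `NunchakuFluxTransformer2dModel` and the quantized checkpoint id follow the project's README at the time of writing and should be treated as assumptions; exact names may differ across versions.

```python
import torch
from diffusers import FluxPipeline
from nunchaku import NunchakuFluxTransformer2dModel

# Load a 4-bit SVDQuant transformer (checkpoint id is an assumption).
transformer = NunchakuFluxTransformer2dModel.from_pretrained(
    "mit-han-lab/svdq-int4-flux.1-dev"
)

# Swap it into a standard FLUX pipeline; the rest of the pipeline is unchanged.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
).to("cuda")

image = pipe("a photo of a corgi in the snow", num_inference_steps=28).images[0]
image.save("corgi.png")
```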
kyegomez/BitNet
A PyTorch implementation of BitNet, enabling highly efficient 1-bit transformers for large language models.
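The core idea from the BitNet paper, independent of this particular implementation: binarize weights to ±1 around their mean, rescale by the mean absolute deviation, and train with a straight-through estimator. A self-contained conceptual PyTorch sketch (not this repo's API):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinaryLinear(nn.Module):
    """Conceptual 1-bit linear layer in the spirit of BitNet (not the repo's API)."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.weight
        alpha = w.mean()                    # centering constant
        beta = (w - alpha).abs().mean()     # restores the weight scale
        w_bin = torch.sign(w - alpha)       # weights in {-1, +1}
        # Straight-through estimator: forward uses the binarized weights,
        # backward passes gradients to the full-precision weights unchanged.
        w_q = w + (w_bin * beta - w).detach()
        return F.linear(x, w_q)

layer = BinaryLinear(512, 256)
y = layer(torch.randn(4, 512))
print(y.shape)  # torch.Size([4, 256])
```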
intel/auto-round
AutoRound is an advanced quantization toolkit for Large Language Models (LLMs) and Vision-Language Models (VLMs), enabling high-accuracy, ultra-low-bit inference across diverse hardware.
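A minimal sketch assuming AutoRound's Python API (`AutoRound(model, tokenizer, ...)` followed by `quantize()` and `save_quantized()`); the model id is an example and exact arguments may differ across versions.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_id = "Qwen/Qwen2.5-0.5B"  # example model
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# W4A16 with group size 128; AutoRound tunes the rounding of each weight
# via signed gradient descent on a small calibration set.
autoround = AutoRound(model, tokenizer, bits=4, group_size=128, sym=True)
autoround.quantize()
autoround.save_quantized("./qwen2.5-0.5b-autoround-4bit")
```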
nunchaku-ai/ComfyUI-nunchaku
A ComfyUI plugin that brings Nunchaku's SVDQuant-based 4-bit inference to AI image generation workflows.