Tags: #gpu-optimization

AI Development & Deployment Platform
pip
25.9k

modular/modular

A unified, open platform for accelerating AI model serving and scaling GenAI deployments with industry-leading performance across various hardware.

Large Language Model Training Tool
Hugging Face
6.7k

yangjianxin1/Firefly

Firefly is an open-source toolkit for efficient large language model training. It supports pre-training, instruction fine-tuning, and DPO, with parameter-efficient methods such as LoRA and QLoRA.
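To see why LoRA-style training is efficient, here is a minimal NumPy sketch of the LoRA idea that toolkits like Firefly build on (this is illustrative only, not Firefly's API; the dimensions and hyperparameters are made-up examples):

```python
import numpy as np

# Minimal sketch of LoRA: instead of updating a frozen d_out x d_in weight W,
# train a low-rank update B @ A with rank r << min(d_out, d_in).
d_in, d_out, r = 1024, 1024, 8
rng = np.random.default_rng(0)

W = rng.standard_normal((d_out, d_in))     # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, r))                   # trainable up-projection (init to 0)
alpha = 16                                 # LoRA scaling hyperparameter

def lora_forward(x):
    # Base path plus scaled low-rank path; identical to W @ x at init (B = 0).
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
assert np.allclose(lora_forward(x), W @ x)  # no behavior change before training

full_params = W.size
lora_params = A.size + B.size
print(f"trainable params: {lora_params} vs full fine-tune {full_params} "
      f"({100 * lora_params / full_params:.2f}%)")
```

Only `A` and `B` receive gradients, so the optimizer state and trainable-parameter count shrink by orders of magnitude; QLoRA goes further by keeping the frozen `W` in 4-bit precision.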

LLM Inference Optimization Library
Python
16.4k

lyogavin/airllm

AirLLM reduces the memory needed for large language model inference, running 70B-parameter models on a single 4GB GPU without quantization, and 405B Llama 3.1 on 8GB of VRAM.
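The core trick is layered inference: only one transformer layer's weights are resident at a time, loaded from disk, applied, and freed. A toy sketch of that idea, with small dense layers standing in for transformer blocks (this is not AirLLM's actual API; file names and dimensions are invented for illustration):

```python
import numpy as np
import os
import tempfile

# Layer-by-layer inference sketch: peak weight memory is ~one layer (d*d
# floats), not n_layers * d * d, because each layer is streamed from disk.
d, n_layers = 64, 6
rng = np.random.default_rng(0)
tmp = tempfile.mkdtemp()

# "Checkpoint": one file per layer, like a sharded model on disk.
for i in range(n_layers):
    layer = rng.standard_normal((d, d)) / np.sqrt(d)
    np.save(os.path.join(tmp, f"layer_{i}.npy"), layer)

def forward(x):
    h = x
    for i in range(n_layers):
        W = np.load(os.path.join(tmp, f"layer_{i}.npy"))  # load ONLY this layer
        h = np.tanh(W @ h)                                # apply it
        del W                                             # free before the next load
    return h

out = forward(rng.standard_normal(d))
print(out.shape)
```

The trade-off is latency: every forward pass re-reads the weights from disk, which is why this approach targets fitting huge models at all rather than serving them fast.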

LLM Inference Server
Docker
3.8k

predibase/lorax

A multi-LoRA inference server that efficiently serves thousands of fine-tuned large language models on a single GPU, drastically cutting serving costs while maintaining high throughput and low latency.
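The reason thousands of fine-tunes fit on one GPU: they all share a single copy of the base weights, and each fine-tune contributes only a tiny low-rank adapter. A NumPy sketch of that serving scheme (adapter names, shapes, and the `serve` function are illustrative assumptions, not LoRAX internals):

```python
import numpy as np

# Multi-LoRA serving sketch: one shared base weight W serves every request;
# each request selects its own small adapter (A_i, B_i).
d, r = 256, 8
rng = np.random.default_rng(1)
W = rng.standard_normal((d, d)) / np.sqrt(d)     # base model weight, loaded once

adapters = {                                     # thousands of these fit in memory
    name: (rng.standard_normal((r, d)) * 0.01,   # A: r x d
           rng.standard_normal((d, r)) * 0.01)   # B: d x r
    for name in ("support-bot", "sql-gen", "summarizer")
}

def serve(x, adapter_name):
    A, B = adapters[adapter_name]
    return W @ x + B @ (A @ x)                   # shared base + per-tenant delta

x = rng.standard_normal(d)
outs = {name: serve(x, name) for name in adapters}

# Each fine-tune costs 2*r*d parameters instead of a full d*d copy of W.
print(f"per-adapter params: {2 * r * d} vs full copy: {d * d}")
```

In a real server the adapter paths for different requests are also batched together into one matrix multiply, which is how high throughput is preserved across heterogeneous adapters.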

OSS Alternative

Explore the best open source alternatives to commercial software.

© 2026 OSS Alternative. hotgithub.com - All rights reserved.