Ecosystem & Stack: GPU
vllm-project/vllm
A high-throughput, memory-efficient open-source inference engine for fast, easy, and cost-effective serving of large language models.
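vLLM's memory efficiency comes largely from paging the KV cache: a sequence's logical token positions are mapped to small fixed-size physical blocks, so memory grows in block-sized increments rather than one large contiguous buffer per request. A toy sketch of that block-table idea, in plain Python and not vLLM's actual code:

```python
# Toy sketch of the paged KV-cache idea behind vLLM's memory efficiency.
# Illustrative only; block size and allocator are assumptions.

BLOCK_SIZE = 4  # tokens per physical block (vLLM's default is larger)

class BlockTable:
    """Maps a sequence's logical token positions to physical block IDs."""
    def __init__(self, allocator):
        self.allocator = allocator
        self.blocks = []      # physical block IDs, in logical order
        self.num_tokens = 0

    def append_token(self):
        # Allocate a new physical block only when the last one is full.
        if self.num_tokens % BLOCK_SIZE == 0:
            self.blocks.append(self.allocator.pop())
        self.num_tokens += 1

    def physical_slot(self, pos):
        # Translate a logical position to (block id, offset within block).
        return self.blocks[pos // BLOCK_SIZE], pos % BLOCK_SIZE

free_blocks = list(range(100))  # pool of free physical blocks
seq = BlockTable(free_blocks)
for _ in range(10):             # a 10-token sequence
    seq.append_token()

print(len(seq.blocks))       # → 3 (ceil(10 / 4) blocks, not a 10-slot buffer)
print(seq.physical_slot(9))  # last token lands in the 3rd block, offset 1
```

Because blocks are fixed-size and shared from a pool, freed sequences return memory immediately and fragmentation stays bounded; that is what lets the engine batch many requests on one GPU.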
OpenPipe/ART
An open-source framework for training multi-step LLM agents using reinforcement learning (GRPO) to learn from experience, offering a serverless RL training service.
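The GRPO algorithm that ART builds on scores several sampled completions per prompt and normalizes each reward against its group's statistics, so no separate value model is needed. A minimal sketch of that group-relative advantage, illustrative only and not ART's actual implementation:

```python
# Group-relative advantage as used in GRPO-style training (sketch).
import statistics

def grpo_advantages(rewards, eps=1e-8):
    """Advantage of each completion relative to its sampled group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]

# Four completions for one prompt, scored by some reward function.
advs = grpo_advantages([1.0, 0.0, 0.0, 1.0])
print(advs)  # above-mean completions get positive advantage, below-mean negative
```

Completions that beat their group's mean are reinforced and the rest are discouraged, which is what lets a multi-step agent improve from its own rollouts.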
axolotl-ai-cloud/axolotl
A free and open-source framework designed for efficient and flexible fine-tuning of large language models.
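Axolotl drives fine-tuning runs from a single YAML file. A hedged sketch of what a QLoRA config might look like; the model, dataset path, and hyperparameter values here are placeholders, and exact keys should be checked against the axolotl documentation:

```yaml
# Sketch of an axolotl QLoRA fine-tuning config (values are assumptions).
base_model: NousResearch/Llama-2-7b-hf
load_in_4bit: true
adapter: qlora

datasets:
  - path: mhenrichsen/alpaca_2k_test
    type: alpaca

sequence_len: 2048
micro_batch_size: 2
gradient_accumulation_steps: 4
num_epochs: 3
learning_rate: 0.0002

lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true

output_dir: ./outputs/qlora-out
```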
containers/ramalama
RamaLama simplifies local serving and production inference of AI models from any source by leveraging familiar container patterns, eliminating complex host-system configuration.
LMCache/LMCache
LMCache is an LLM serving engine extension designed to significantly reduce Time-To-First-Token (TTFT) and boost throughput, especially for long-context scenarios, by intelligently reusing KV caches.
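The core idea is keying cached KV state by token prefix, so a request that shares a long prefix (e.g. a reused system prompt or document) skips recomputing it. A conceptual pure-Python sketch of prefix lookup, not LMCache's actual API:

```python
# Toy sketch of prefix-keyed KV-cache reuse (the idea behind LMCache's
# TTFT reduction). Hashing and storage format here are assumptions.
import hashlib

class PrefixKVCache:
    def __init__(self):
        self.store = {}  # prefix hash -> (num cached tokens, kv blob)

    @staticmethod
    def _key(tokens):
        return hashlib.sha256(str(tokens).encode("utf-8")).hexdigest()

    def put(self, tokens, kv):
        self.store[self._key(tokens)] = (len(tokens), kv)

    def longest_prefix_hit(self, tokens):
        # Check progressively shorter prefixes; return the longest match.
        for end in range(len(tokens), 0, -1):
            hit = self.store.get(self._key(tokens[:end]))
            if hit:
                return hit
        return (0, None)  # no reuse possible; full prefill needed

cache = PrefixKVCache()
system_prompt = [101, 7, 42, 9]
cache.put(system_prompt, kv="kv-for-system-prompt")

# New request: same system prompt plus user tokens -> prefix is reused.
reused, _ = cache.longest_prefix_hit(system_prompt + [55, 13])
print(reused)  # → 4 (4 of 6 tokens served from cache)
```

In long-context serving the shared prefix dominates prefill cost, so reusing it directly cuts time-to-first-token.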
katanaml/sparrow
Sparrow is a production-ready platform for structured data extraction and instruction calling from various documents and images using ML, LLM, and Vision LLM technologies.
NVIDIA-NeMo/Curator
A GPU-accelerated, scalable toolkit for multimodal data preprocessing and curation, designed to train better AI models faster.
AI-Hypercomputer/maxtext
A high-performance, scalable JAX-based open-source library for training large language models on Google Cloud TPUs and GPUs.
stochasticai/xTuring
xTuring simplifies the fine-tuning, evaluation, and deployment of open-source Large Language Models (LLMs) on private data, ensuring privacy and efficiency.
Docta-ai/docta
Docta is an advanced data-centric AI platform that detects and rectifies issues in various data types to improve model performance.
XavierXiao/Dreambooth-Stable-Diffusion
An implementation of Google's Dreambooth technique on Stable Diffusion, enabling personalized text-to-image model fine-tuning with limited examples.
fikrikarim/parlor
Parlor is an on-device, real-time multimodal AI that enables natural voice and vision conversations, running entirely on your local machine.
NexaAI/nexa-sdk
A high-performance local inference framework for running frontier multimodal AI models on various devices with minimal energy consumption.