Ecosystem & Stack: GPU
vllm-project/vllm
A high-throughput, memory-efficient open-source inference engine for fast, easy, and cost-effective serving of large language models.
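vLLM's memory efficiency comes largely from paging the KV cache: a sequence's logical token positions are mapped to small fixed-size physical blocks, so memory grows in block-sized increments rather than one large contiguous buffer per request. A toy sketch of that block-table idea, in plain Python and not vLLM's actual code:

```python
# Toy sketch of the paged KV-cache idea behind vLLM's memory efficiency.
# Illustrative only; block size and allocator are assumptions.

BLOCK_SIZE = 4  # tokens per physical block (vLLM's default is larger)

class BlockTable:
    """Maps a sequence's logical token positions to physical block IDs."""
    def __init__(self, allocator):
        self.allocator = allocator
        self.blocks = []      # physical block IDs, in logical order
        self.num_tokens = 0

    def append_token(self):
        # Allocate a new physical block only when the last one is full.
        if self.num_tokens % BLOCK_SIZE == 0:
            self.blocks.append(self.allocator.pop())
        self.num_tokens += 1

    def physical_slot(self, pos):
        # Translate a logical position to (block id, offset within block).
        return self.blocks[pos // BLOCK_SIZE], pos % BLOCK_SIZE

free_blocks = list(range(100))  # pool of free physical blocks
seq = BlockTable(free_blocks)
for _ in range(10):             # a 10-token sequence
    seq.append_token()

print(len(seq.blocks))       # → 3 (ceil(10 / 4) blocks, not a 10-slot buffer)
print(seq.physical_slot(9))  # last token lands in the 3rd block, offset 1
```

Because blocks are fixed-size and shared from a pool, freed sequences return memory immediately and fragmentation stays bounded; that is what lets the engine batch many requests on one GPU.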
OpenPipe/ART
An open-source framework for training multi-step LLM agents using reinforcement learning (GRPO) to learn from experience, offering a serverless RL training service.
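The GRPO algorithm that ART builds on scores several sampled completions per prompt and normalizes each reward against its group's statistics, so no separate value model is needed. A minimal sketch of that group-relative advantage, illustrative only and not ART's actual implementation:

```python
# Group-relative advantage as used in GRPO-style training (sketch).
import statistics

def grpo_advantages(rewards, eps=1e-8):
    """Advantage of each completion relative to its sampled group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]

# Four completions for one prompt, scored by some reward function.
advs = grpo_advantages([1.0, 0.0, 0.0, 1.0])
print(advs)  # above-mean completions get positive advantage, below-mean negative
```

Completions that beat their group's mean are reinforced and the rest are discouraged, which is what lets a multi-step agent improve from its own rollouts.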
axolotl-ai-cloud/axolotl
A free and open-source framework designed for efficient and flexible fine-tuning of large language models.
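Axolotl drives fine-tuning runs from a single YAML file. A hedged sketch of what a QLoRA config might look like; the model, dataset path, and hyperparameter values here are placeholders, and exact keys should be checked against the axolotl documentation:

```yaml
# Sketch of an axolotl QLoRA fine-tuning config (values are assumptions).
base_model: NousResearch/Llama-2-7b-hf
load_in_4bit: true
adapter: qlora

datasets:
  - path: mhenrichsen/alpaca_2k_test
    type: alpaca

sequence_len: 2048
micro_batch_size: 2
gradient_accumulation_steps: 4
num_epochs: 3
learning_rate: 0.0002

lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true

output_dir: ./outputs/qlora-out
```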
containers/ramalama
RamaLama simplifies local serving and production inference of AI models from any source by leveraging familiar container patterns, eliminating complex host-system configuration.
LMCache/LMCache
LMCache is an LLM serving engine extension designed to significantly reduce Time-To-First-Token (TTFT) and boost throughput, especially for long-context scenarios, by intelligently reusing KV caches.
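The core idea is keying cached KV state by token prefix, so a request that shares a long prefix (e.g. a reused system prompt or document) skips recomputing it. A conceptual pure-Python sketch of prefix lookup, not LMCache's actual API:

```python
# Toy sketch of prefix-keyed KV-cache reuse (the idea behind LMCache's
# TTFT reduction). Hashing and storage format here are assumptions.
import hashlib

class PrefixKVCache:
    def __init__(self):
        self.store = {}  # prefix hash -> (num cached tokens, kv blob)

    @staticmethod
    def _key(tokens):
        return hashlib.sha256(str(tokens).encode("utf-8")).hexdigest()

    def put(self, tokens, kv):
        self.store[self._key(tokens)] = (len(tokens), kv)

    def longest_prefix_hit(self, tokens):
        # Check progressively shorter prefixes; return the longest match.
        for end in range(len(tokens), 0, -1):
            hit = self.store.get(self._key(tokens[:end]))
            if hit:
                return hit
        return (0, None)  # no reuse possible; full prefill needed

cache = PrefixKVCache()
system_prompt = [101, 7, 42, 9]
cache.put(system_prompt, kv="kv-for-system-prompt")

# New request: same system prompt plus user tokens -> prefix is reused.
reused, _ = cache.longest_prefix_hit(system_prompt + [55, 13])
print(reused)  # → 4 (4 of 6 tokens served from cache)
```

In long-context serving the shared prefix dominates prefill cost, so reusing it directly cuts time-to-first-token.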
katanaml/sparrow
Sparrow is a production-ready platform for structured data extraction and instruction calling from various documents and images using ML, LLM, and Vision LLM technologies.
NVIDIA-NeMo/Curator
A GPU-accelerated, scalable toolkit for multimodal data preprocessing and curation, designed to train better AI models faster.
AI-Hypercomputer/maxtext
A high-performance, scalable JAX-based open-source library for training large language models on Google Cloud TPUs and GPUs.
stochasticai/xTuring
xTuring simplifies the fine-tuning, evaluation, and deployment of open-source Large Language Models (LLMs) on private data, ensuring privacy and efficiency.
Docta-ai/docta
Docta is an advanced data-centric AI platform that detects and rectifies issues in various data types to improve model performance.
XavierXiao/Dreambooth-Stable-Diffusion
An implementation of Google's Dreambooth technique on Stable Diffusion, enabling personalized text-to-image model fine-tuning with limited examples.
fikrikarim/parlor
Parlor is an on-device, real-time multimodal AI that enables natural voice and vision conversations, running entirely on your local machine.
NexaAI/nexa-sdk
A high-performance local inference framework for running frontier multimodal AI models on various devices with minimal energy consumption.