Ecosystem & Stack: PyTorch
unslothai/unsloth
Unsloth Studio is a web UI that enables efficient local training and inference of open-source large language models and other AI models with significant VRAM and speed optimizations.
modelscope/ms-swift
A scalable and lightweight infrastructure for fine-tuning, inference, and deployment of over 1000 large language models (LLMs) and multimodal large language models (MLLMs) using advanced techniques.
sgl-project/sglang
SGLang is a high-performance serving framework for large language models and multimodal models, optimizing inference throughput and latency.
huggingface/peft
A state-of-the-art library for Parameter-Efficient Fine-Tuning (PEFT) of large pretrained models, drastically reducing computational and storage costs.
vllm-project/vllm
vLLM is a high-throughput and memory-efficient open-source library designed for fast and easy serving of large language models.
Lightning-AI/pytorch-lightning
Streamlines complex deep learning engineering, enabling scalable AI model training and finetuning across diverse hardware with minimal code changes.
InternLM/xtuner
A next-generation training engine optimized for ultra-large Mixture-of-Experts (MoE) models, offering superior efficiency and scalability.
rasbt/LLMs-from-scratch
An educational project providing step-by-step code to build a ChatGPT-like Large Language Model (LLM) from scratch using PyTorch.
p-e-w/heretic
Heretic is an AI model utility that automatically removes censorship and safety alignment from transformer-based language models without requiring expensive post-training.
adongwanai/AgentGuide
A comprehensive, job-oriented guide for AI Agent development, covering core technologies, practical projects, and interview preparation for LLM-related roles.
kvcache-ai/Mooncake
A KVCache-centric disaggregated architecture for high-performance LLM serving, powering leading AI services.
SwanHubX/SwanLab
SwanLab is an open-source platform with a modern design for tracking, visualizing, and analyzing AI/ML training experiments, supporting cloud and self-hosted deployments.
kubeflow/trainer
A Kubernetes-native platform for scalable distributed AI model training and LLM fine-tuning across various frameworks.
stas00/ml-engineering
An open collection of methodologies, tools, and step-by-step instructions for successful training, fine-tuning, and inference of large language and multi-modal models.
activeloopai/deeplake
Deep Lake is an AI data runtime and database optimized for deep learning, offering serverless multimodal data storage, scalable retrieval, and training capabilities.
tencentmusic/cube-studio
An open-source, cloud-native, all-in-one MLOps platform designed for the full lifecycle management of machine learning, deep learning, and large language model development and deployment.
bitsandbytes-foundation/bitsandbytes
A PyTorch library enabling accessible large language models through k-bit quantization, significantly reducing memory consumption for both inference and training.
ludwig-ai/ludwig
A low-code, declarative framework for building and deploying custom large language models (LLMs) and other deep neural networks with ease and efficiency.
openvinotoolkit/openvino
OpenVINO is an open-source toolkit designed to optimize and deploy deep learning models for efficient AI inference across a wide range of hardware platforms.
microsoft/AI-For-Beginners
A 12-week, 24-lesson curriculum from Microsoft to learn Artificial Intelligence for beginners, including practical lessons, quizzes, and labs.
huggingface/datasets
A lightweight library providing one-line dataloaders and efficient pre-processing tools for a vast hub of AI datasets, supporting various ML frameworks.
docarray/docarray
A Python library for representing, transmitting, storing, and retrieving multimodal data, designed for AI applications.
stochasticai/xTuring
xTuring simplifies the process of fine-tuning and deploying open-source Large Language Models (LLMs) on private data, ensuring privacy, efficiency, and scalability.
tianrun-chen/SAM-Adapter-PyTorch
A PyTorch-based framework to adapt Meta AI's Segment Anything Model (SAM) for improved performance on challenging downstream computer vision tasks using adapters and prompts.
ModelCloud/GPTQModel
A toolkit for quantizing (compressing) Large Language Models (LLMs) with hardware acceleration across various GPUs and CPUs, integrating with popular inference frameworks.
X-LANCE/SLAM-LLM
A deep learning toolkit for training custom multimodal large language models focused on speech, language, audio, and music processing.
zyds/transformers-code
A comprehensive code repository accompanying a hands-on course for mastering Hugging Face Transformers, covering topics from fundamental concepts to advanced fine-tuning and deployment techniques.
lxe/simple-llm-finetuner
A beginner-friendly UI for fine-tuning large language models (LLMs) using the LoRA method on commodity NVIDIA GPUs.
mymusise/ChatGLM-Tuning
A cost-effective solution for finetuning ChatGLM-6B with LoRA, enabling personalized large language models.
adapter-hub/adapters
A unified library extending Hugging Face Transformers for parameter-efficient and modular transfer learning in NLP.
lyogavin/airllm
Optimizes large language model inference to run 70B models on a single 4GB GPU without quantization, enabling efficient deployment on resource-constrained hardware.
labmlai/annotated_deep_learning_paper_implementations
A comprehensive collection of PyTorch implementations for over 60 deep learning papers, featuring side-by-side annotated notes for enhanced understanding.
camenduru/stable-diffusion-webui-colab
Provides Google Colab notebooks for easily deploying and running Stable Diffusion WebUI, enabling AI-powered image generation and training without local hardware.
microsoft/LoRA
A Python library implementing LoRA (Low-Rank Adaptation) to efficiently fine-tune large language models by significantly reducing trainable parameters and storage requirements.
LianjiaTech/BELLE
BELLE is an open-source project dedicated to fostering the development of Chinese conversational large language models, aiming to make LLMs accessible to everyone.
wenge-research/YAYI
YaYi is an open-source Chinese Large Language Model, built on LLaMA 2 & BLOOM, designed for secure, reliable, and domain-specific applications through extensive instruction tuning.
PKU-Alignment/align-anything
A modular framework for aligning any-modality large models with human intentions and values using various fine-tuning and reinforcement learning methods.
zai-org/ImageReward
A human preference reward model for evaluating and improving text-to-image generation models.
RLHFlow/RLHF-Reward-Modeling
A comprehensive collection of recipes and code for training various reward models crucial for Reinforcement Learning from Human Feedback (RLHF) in large language models.
OpenLMLab/MOSS-RLHF
An open-source framework providing code, models, and insights for stable Reinforcement Learning from Human Feedback (RLHF) training in Large Language Models, focusing on the PPO algorithm and reward modeling.
huggingface/diffusers
A modular PyTorch library for state-of-the-art diffusion models, enabling easy inference and training for image, video, and audio generation.
Sanster/IOPaint
An open-source, AI-driven tool for advanced image inpainting, outpainting, object removal, and replacement using state-of-the-art models.
carson-katri/dream-textures
Integrates Stable Diffusion directly into Blender for seamless AI-powered texture generation, concept art creation, and image manipulation within 3D workflows.
ashawkey/stable-dreamfusion
A PyTorch implementation for generating 3D models from text or images, leveraging NeRF and diffusion models like Stable Diffusion.
kuprel/min-dalle
A fast, minimal PyTorch port of DALL·E Mini for efficient text-to-image generation.
lucidrains/imagen-pytorch
A PyTorch implementation of Google's Imagen, a state-of-the-art text-to-image neural network, enabling advanced generative AI capabilities.
lucidrains/DALLE2-pytorch
A PyTorch implementation of OpenAI's DALL-E 2, a state-of-the-art neural network for text-to-image synthesis.
lucidrains/DALLE-pytorch
An open-source PyTorch implementation and replication of OpenAI's DALL-E, a text-to-image transformer, including CLIP for generation ranking.
XavierXiao/Dreambooth-Stable-Diffusion
This project implements Google's DreamBooth technique on Stable Diffusion, enabling users to fine-tune a text-to-image model with a few custom examples for personalized image generation.
NVIDIA-NeMo/NeMo
A scalable generative AI framework for researchers and developers focused on Large Language Models, Multimodal, and Speech AI (ASR, TTS).
snakers4/silero-models
A collection of pre-trained, end-to-end text-to-speech models designed for simplicity, speed, and natural-sounding speech across multiple languages.
fishaudio/Bert-VITS2
An open-source text-to-speech model that combines the VITS2 backbone with multilingual BERT for high-quality, multi-language speech synthesis.
denizsafak/abogen
Generate high-quality audiobooks and voiceovers from various text formats with synchronized captions.
babysor/MockingBird
A real-time voice cloning toolkit that allows users to replicate a voice in 5 seconds and generate arbitrary speech.
santinic/audiblez
A Python-based tool to convert e-books (EPUB) into high-quality M4B audiobooks using advanced text-to-speech models.
RVC-Boss/GPT-SoVITS
A powerful web-based tool for few-shot voice cloning and text-to-speech, enabling high-quality voice generation from minimal audio data.
remsky/Kokoro-FastAPI
A Dockerized FastAPI wrapper for the Kokoro-82M text-to-speech model, offering multi-language support, CPU/GPU inference, and an OpenAI-compatible API.
pixeltable/pixeltable
A declarative, transactional Python library for building multimodal AI applications with incremental data storage, transformation, indexing, and orchestration.
EvolvingLMMs-Lab/lmms-eval
A unified, reproducible, and efficient multimodal evaluation toolkit for large language models across text, image, video, and audio tasks.
OpenMOSS/MOSS-TTS
An open-source AI model family for high-fidelity, expressive speech and sound generation across diverse real-world applications.
kyegomez/BitNet
A PyTorch implementation of BitNet, enabling highly efficient 1-bit transformers for large language models.
facebookresearch/mmf
A modular and scalable PyTorch-based framework for state-of-the-art vision and language multimodal research from Facebook AI Research.
emcf/thepipe
A Python library for extracting clean markdown, multimodal media, and structured data from complex documents using vision-language models.
ZhaoJ9014/face.evoLVe
A high-performance, comprehensive face recognition library built on PaddlePaddle and PyTorch.
dbiir/UER-py
An open-source PyTorch-based framework for NLP pre-training and fine-tuning, offering modularity, reproducibility, and a comprehensive model zoo for various downstream tasks.
liucongg/ChatGLM-Finetuning
A toolkit for fine-tuning ChatGLM series models (ChatGLM-6B, ChatGLM2-6B, ChatGLM3-6B) using methods such as Freeze, LoRA, P-tuning, and full-parameter training for downstream NLP tasks.
PhoebusSi/Alpaca-CoT
A unified platform simplifying instruction-tuning for Large Language Models by integrating diverse data, LLMs, and parameter-efficient methods.
myshell-ai/OpenVoice
An open-source AI model for instant, accurate, and flexible voice cloning, supporting cross-lingual synthesis and granular style control.
MoonInTheRiver/DiffSinger
DiffSinger is an official PyTorch implementation of a singing voice synthesis (SVS) and text-to-speech (TTS) system, leveraging a shallow diffusion mechanism for high-quality audio generation.
netease-youdao/EmotiVoice
EmotiVoice is an open-source, multi-voice, and prompt-controlled text-to-speech engine supporting English and Chinese with emotional synthesis capabilities.
yl4579/StyleTTS2
StyleTTS 2 is a cutting-edge text-to-speech model achieving human-level speech synthesis through style diffusion and adversarial training with large speech language models.
metavoiceio/metavoice-src
MetaVoice-1B is an open-source 1.2B-parameter foundation model for highly expressive, human-like text-to-speech synthesis with advanced voice cloning capabilities.
Plachtaa/VALL-E-X
An open-source implementation of Microsoft's VALL-E X, enabling zero-shot multilingual text-to-speech synthesis and voice cloning with emotion control.
mozilla/TTS
A deep learning library for advanced, high-quality, and efficient Text-to-Speech (TTS) synthesis, supporting multiple languages and models.
KohakuBlueleaf/LyCORIS
LyCORIS is a library implementing various parameter-efficient fine-tuning (PEFT) algorithms for Stable Diffusion, extending beyond conventional LoRA methods to enhance model adaptation.
nateraw/stable-diffusion-videos
Create dynamic and visually captivating videos by smoothly morphing between different text prompts using Stable Diffusion.
numz/ComfyUI-SeedVR2_VideoUpscaler
Official SeedVR2 Video Upscaler for ComfyUI, enabling high-quality video and image upscaling, also runnable as a standalone CLI.
kyegomez/OpenMythos
An open-source, theoretical reconstruction of the Claude Mythos LLM architecture, featuring a Recurrent-Depth Transformer and sparse Mixture of Experts for advanced reasoning.
ashleve/lightning-hydra-template
A user-friendly template integrating PyTorch Lightning and Hydra to streamline deep learning experimentation and development.
GeeeekExplorer/nano-vllm
A lightweight and optimized Python library for fast offline large language model inference, offering performance comparable to or better than vLLM with a more readable codebase.
SylphAI-Inc/AdalFlow
AdalFlow is a PyTorch-like open-source library designed to build and automatically optimize large language model (LLM) applications, from chatbots and RAG systems to complex AI agents.
Lightning-AI/litgpt
A high-performance, no-abstraction toolkit providing recipes for pretraining, finetuning, and deploying over 20 large language models at scale.
open-mmlab/mmpretrain
MMPreTrain is an OpenMMLab project providing a comprehensive, open-source PyTorch-based toolbox for pre-training and benchmarking various computer vision and multi-modal models.
microsoft/torchscale
A PyTorch library providing advanced foundation architectures to efficiently and effectively scale Transformers for large language models and general-purpose AI.
jina-ai/discoart
Create stunning Disco Diffusion artworks with a single line of Python code, offering a professional API and robust integration capabilities.
riffusion/riffusion-hobby
A library for real-time music and audio generation leveraging stable diffusion, offering CLI, interactive app, and API capabilities.
OpenMOSS/MOSS-TTS-Nano
MOSS-TTS-Nano is an open-source, multilingual, tiny speech generation model optimized for real-time CPU inference and lightweight integration.