Tags: #deep-learning
huggingface/transformers
A unified framework providing state-of-the-art machine learning models for text, vision, audio, and multimodal tasks, optimized for both inference and training.
modelscope/ms-swift
A comprehensive framework from ModelScope for efficiently fine-tuning, evaluating, and deploying over 1000 large language models and multimodal large models.
huggingface/peft
PEFT is a state-of-the-art library for Parameter-Efficient Fine-Tuning, drastically reducing the computational and storage costs of adapting large pretrained models.
vllm-project/vllm
A high-throughput and memory-efficient open-source engine designed for fast, easy, and cost-effective serving of large language models.
hiyouga/LlamaFactory
A unified and efficient framework for fine-tuning over 100 large language models (LLMs) and vision-language models (VLMs) through either a CLI or a Web UI.
Lightning-AI/pytorch-lightning
A deep learning framework that simplifies PyTorch development by automating boilerplate engineering code, enabling scalable training from CPU to multi-node GPUs with minimal code changes.
alibaba/MNN
A blazing-fast, lightweight inference engine from Alibaba, powering high-performance on-device LLMs and Edge AI.
axolotl-ai-cloud/axolotl
A free and open-source framework designed for efficient and flexible fine-tuning of large language models.
InternLM/xtuner
XTuner V1 is a next-generation training engine specifically designed for ultra-large-scale Mixture-of-Experts (MoE) models, offering superior efficiency and scalability.
rasbt/LLMs-from-scratch
A comprehensive, step-by-step guide and codebase for building a ChatGPT-like Large Language Model from scratch using PyTorch.
lutzroeder/netron
A universal viewer for neural network, deep learning, and machine learning models, supporting a wide array of formats.
luhengshiwo/LLMForEverybody
A comprehensive learning platform offering structured knowledge, interview questions, and paper analysis for Large Language Models (LLMs).
alvinunreal/awesome-opensource-ai
A meticulously curated list of battle-tested, production-proven open-source AI models, libraries, infrastructure, and developer tools.
polyaxon/polyaxon
A comprehensive MLOps platform for managing, orchestrating, and scaling the entire machine learning and deep learning lifecycle.
activeloopai/deeplake
Deep Lake is an AI data runtime and database optimized for deep learning, offering multimodal data storage, querying, vector search, and streaming for LLM and deep learning applications.
tencentmusic/cube-studio
A comprehensive, cloud-native, one-stop platform for machine learning, deep learning, and large language model development, covering the entire MLOps lifecycle.
bitsandbytes-foundation/bitsandbytes
A PyTorch library enabling accessible large language models by dramatically reducing memory consumption through k-bit quantization for both inference and training.
ludwig-ai/ludwig
Ludwig is a low-code, declarative deep learning framework designed to simplify the building, training, and deployment of custom AI models, including LLMs and neural networks.
AccumulateMore/CV
A comprehensive and curated collection of deep learning study notes, integrating content from leading educators like Andrew Ng, Li Mu, and TuDui, covering CV, NLP, and Large Language Models.
openvinotoolkit/openvino
OpenVINO is an open-source toolkit designed to optimize and deploy deep learning models for efficient AI inference across diverse hardware platforms, from edge to cloud.
microsoft/AI-For-Beginners
A comprehensive 12-week, 24-lesson curriculum designed to introduce beginners to the fundamentals of Artificial Intelligence.
spmallick/learnopencv
A comprehensive repository offering C++ and Python code examples for computer vision, deep learning, and AI research, complementing articles on LearnOpenCV.com.
recommenders-team/recommenders
A comprehensive toolkit providing best practices and implementations for building, evaluating, and operationalizing recommendation systems.
hpcaitech/ColossalAI
Colossal-AI makes training and deploying large AI models cheaper, faster, and more accessible through advanced distributed training techniques.
AI-Hypercomputer/maxtext
A high-performance, scalable JAX-based open-source library for training large language models on Google Cloud TPUs and GPUs.
h2oai/h2o-llmstudio
A no-code GUI and framework for easily fine-tuning state-of-the-art large language models (LLMs).
tianrun-chen/SAM-Adapter-PyTorch
A PyTorch-based framework to adapt Meta AI's Segment Anything Model (SAM) for improved performance on challenging downstream computer vision tasks using adapters and prompts.
LLMBook-zh/LLMBook-zh.github.io
A comprehensive Chinese technical book and associated course materials providing a systematic framework and roadmap for understanding Large Language Models, authored by leading experts.
ModelCloud/GPTQModel
A toolkit for quantizing (compressing) Large Language Models (LLMs) with hardware acceleration across various GPUs and CPUs, integrating with popular inference frameworks.
X-LANCE/SLAM-LLM
A deep learning toolkit for training custom multimodal large language models focused on speech, language, audio, and music processing.
zyds/transformers-code
A comprehensive code repository accompanying a hands-on course on Hugging Face Transformers, covering everything from fundamental concepts to advanced fine-tuning and deployment techniques.
lyogavin/airllm
AirLLM reduces the memory needed for large language model inference, running 70B LLMs on a single 4GB GPU without quantization and Llama 3.1 405B on 8GB of VRAM.
labmlai/annotated_deep_learning_paper_implementations
A comprehensive collection of PyTorch implementations for over 60 deep learning papers, accompanied by detailed side-by-side notes for enhanced understanding.
microsoft/LoRA
A PyTorch library implementing LoRA (Low-Rank Adaptation) to efficiently fine-tune large language models by significantly reducing trainable parameters and storage requirements.
JIA-Lab-research/LongLoRA
LongLoRA is an efficient fine-tuning method and associated models/datasets designed to extend the context window of Large Language Models (LLMs) for processing longer inputs.
InternLM/InternLM
A series of high-performance, cost-efficient large language models (LLMs) designed for general-purpose usage and advanced reasoning.
RLHFlow/RLHF-Reward-Modeling
A comprehensive collection of recipes and code for training various reward models crucial for Reinforcement Learning from Human Feedback (RLHF) in large language models.
OpenLMLab/MOSS-RLHF
An open-source framework providing code, models, and insights for stable Reinforcement Learning from Human Feedback (RLHF) training in Large Language Models, focusing on the PPO algorithm and reward modeling.
kuprel/min-dalle
A fast, minimal PyTorch port of DALL·E Mini, optimized for efficient text-to-image generation inference.
lucidrains/imagen-pytorch
A PyTorch implementation of Google's Imagen, a state-of-the-art text-to-image neural network that surpasses DALL-E 2 in synthesis quality.
lucidrains/DALLE2-pytorch
A PyTorch implementation of OpenAI's DALL-E 2, enabling advanced text-to-image synthesis through a diffusion-based neural network architecture.
lucidrains/DALLE-pytorch
An open-source PyTorch implementation of OpenAI's DALL-E, a text-to-image transformer, with CLIP used to rank the generated images.
lucidrains/deep-daze
A simple command-line tool for generating artistic images from text descriptions using OpenAI's CLIP and Siren neural networks.
fishaudio/fish-speech
A state-of-the-art open-source multilingual text-to-speech system offering natural, expressive, and emotionally rich voice generation.
CorentinJ/Real-Time-Voice-Cloning
Clone a voice from 5 seconds of audio and generate arbitrary speech in real time using a three-stage deep learning framework.
babysor/MockingBird
A powerful open-source toolkit for real-time voice cloning and arbitrary speech generation from text.
vllm-project/vllm-omni
vLLM-Omni is an efficient, flexible, and easy-to-use framework extending vLLM to serve omni-modality models (text, image, video, audio) with high throughput and an OpenAI-compatible API.
EvolvingLMMs-Lab/lmms-eval
A unified, reproducible, and efficient evaluation toolkit for large multimodal models across text, image, video, and audio tasks.
kyegomez/BitNet
A PyTorch implementation of BitNet, enabling highly efficient 1-bit transformers for large language models.
2U1/Qwen-VL-Series-Finetune
An open-source implementation for efficiently fine-tuning Alibaba Cloud's Qwen-VL series of multimodal large language models using Hugging Face and Liger-Kernel.
facebookresearch/mmf
A modular PyTorch-based framework from Facebook AI Research for state-of-the-art vision and language multimodal AI research.
OpenGVLab/InternVideo
A series of video foundation models and large-scale datasets designed for comprehensive multimodal video understanding and generation.
intel/auto-round
AutoRound is an advanced quantization toolkit for Large Language Models (LLMs) and Vision-Language Models (VLMs), enabling high-accuracy, ultra-low-bit inference across diverse hardware.
ZhaoJ9014/face.evoLVe
A high-performance, comprehensive face recognition library built on PaddlePaddle and PyTorch.
myshell-ai/OpenVoice
An AI voice synthesis library offering instant, accurate, and flexible voice cloning with multi-lingual support.
coqui-ai/TTS
A deep learning toolkit for Text-to-Speech, offering pretrained models, training tools, and dataset utilities.
netease-youdao/EmotiVoice
An open-source, multi-voice, and prompt-controlled text-to-speech engine capable of generating speech with diverse emotions in English and Chinese.
yl4579/StyleTTS2
StyleTTS 2 is a text-to-speech model that achieves human-level speech synthesis by leveraging style diffusion and adversarial training with large speech language models.
metavoiceio/metavoice-src
MetaVoice-1B is an open-source, 1.2B parameter foundational model for highly expressive, human-like text-to-speech synthesis and zero-shot voice cloning.
TensorSpeech/TensorFlowTTS
TensorFlowTTS provides real-time, state-of-the-art speech synthesis architectures based on TensorFlow 2, supporting multiple languages and optimized for fast inference and deployment on various devices.
Plachtaa/VALL-E-X
An open-source implementation of Microsoft's VALL-E X, enabling zero-shot multilingual text-to-speech synthesis and voice cloning with emotion control.
jaywalnut310/vits
VITS is an end-to-end text-to-speech model that generates highly natural-sounding audio with diverse rhythms, outperforming traditional two-stage TTS systems.
mozilla/TTS
A deep learning library for advanced Text-to-Speech generation, offering high-quality speech synthesis with pretrained models and multi-language support.
X-PLUG/mPLUG-DocOwl
A modularized multimodal large language model designed for OCR-free document understanding.
NExT-GPT/NExT-GPT
The first end-to-end multimodal large language model (MM-LLM) that perceives input and generates output in arbitrary combinations (any-to-any) of text, image, video, and audio.
X-PLUG/mPLUG-Owl
A family of powerful multi-modal large language models (MLLMs) designed to advance AI's understanding and generation capabilities across various data types.
WeThinkIn/AIGC-Interview-Book
A comprehensive interview preparation guide for AIGC and AI algorithm/development roles, covering a wide range of AI technologies and career insights.
FurkanGozukara/Stable-Diffusion
A comprehensive repository offering expert-level tutorials, guides, and courses on various Generative AI technologies, primarily focusing on Stable Diffusion and its ecosystem.
xlite-dev/lite.ai.toolkit
A lightweight C++ toolkit for deploying over 100 diverse AI models with multiple inference engines.
nateraw/stable-diffusion-videos
Create dynamic videos by smoothly transitioning between text prompts using Stable Diffusion's latent space exploration.
numz/ComfyUI-SeedVR2_VideoUpscaler
Official SeedVR2 Video Upscaler for ComfyUI, enabling high-quality video and image upscaling, also runnable as a standalone CLI.
deepfakes/faceswap
An open-source deep learning tool that enables users to recognize and swap faces in images and videos.
GaParmar/img2img-turbo
A one-step image-to-image translation framework leveraging Stable Diffusion Turbo for rapid generation across various tasks like sketch-to-image and day-to-night transformations.
Fanghua-Yu/SUPIR
SUPIR is an AI-driven project focused on developing practical algorithms for photo-realistic image restoration and upscaling in real-world scenarios.