Ecosystem & Stack: huggingface
docling-project/docling
Docling simplifies document processing: it parses diverse formats, offers advanced PDF understanding, and integrates with the generative AI ecosystem.
vllm-project/vllm
vLLM is a high-throughput and memory-efficient open-source library designed for fast and easy serving of large language models.
X-PLUG/MobileAgent
A family of multi-platform GUI agents that automate desktop, mobile, and browser interactions using large language models.
icip-cas/PPTAgent
An agentic framework leveraging AI to autonomously generate and refine professional PowerPoint presentations with deep research integration and visual design capabilities.
kubeflow/trainer
A Kubernetes-native platform for scalable distributed AI model training and LLM fine-tuning across various frameworks.
PacktPublishing/LLM-Engineers-Handbook
A comprehensive practical guide and accompanying code repository for LLM engineers, covering the full lifecycle of building, deploying, and monitoring advanced LLM and RAG applications on AWS with LLMOps best practices.
bespokelabsai/curator
A Python library for generating and curating high-quality synthetic data for AI model training and structured data extraction.
ModelCloud/GPTQModel
A toolkit for quantizing (compressing) Large Language Models (LLMs) with hardware acceleration across various GPUs and CPUs, integrating with popular inference frameworks.
JIA-Lab-research/LongLoRA
LongLoRA is an efficient fine-tuning method, with associated models and datasets, that extends the context window of Large Language Models (LLMs) for processing longer inputs.
PKU-Alignment/align-anything
A modular framework for aligning any-modality large models with human intentions and values using various fine-tuning and reinforcement learning methods.
zai-org/ImageReward
A human preference reward model for evaluating and improving text-to-image generation models.
AI4Finance-Foundation/FinGPT
FinGPT democratizes access to large language models tailored for finance, offering cost-effective and rapidly adaptable solutions to overcome the limitations of proprietary financial AI.
ashawkey/stable-dreamfusion
A PyTorch implementation for generating 3D models from text or images, leveraging NeRF and diffusion models like Stable Diffusion.
Hunyuan-PromptEnhancer/PromptEnhancer
A prompt rewriting tool that refines user prompts into clearer, structured versions to enhance the quality of text-to-image generation and image-to-image editing.
XavierXiao/Dreambooth-Stable-Diffusion
This project implements Google's DreamBooth technique on Stable Diffusion, letting users fine-tune a text-to-image model on a few custom examples for personalized image generation.
EvolvingLMMs-Lab/lmms-eval
A unified, reproducible, and efficient evaluation toolkit for large multimodal models across text, image, video, and audio tasks.
2U1/Qwen-VL-Series-Finetune
An open-source implementation for efficiently fine-tuning Alibaba Cloud's Qwen-VL series of multimodal large language models using Hugging Face and Liger-Kernel.
sentient-agi/OML-1.0-Fingerprinting
A framework for embedding secret cryptographic fingerprints into Large Language Models (LLMs) via fine-tuning to verify ownership and prevent unauthorized use.
PhoebusSi/Alpaca-CoT
A unified platform simplifying instruction-tuning for Large Language Models by integrating diverse data, LLMs, and parameter-efficient methods.
X-PLUG/mPLUG-DocOwl
A modularized multimodal large language model designed for OCR-free document understanding.
OpenGVLab/InternVL
An open-source multimodal large language model family that aims to match or exceed the performance of commercial models such as GPT-4o and GPT-5.
OpenMOSS/MOSS-TTS-Nano
MOSS-TTS-Nano is an open-source, multilingual, tiny speech generation model optimized for real-time CPU inference and lightweight integration.