Tags: #vlm
hiyouga/LlamaFactory
A unified and efficient framework for fine-tuning over 100 large language models (LLMs) and vision-language models (VLMs) with both CLI and Web UI.
PaddlePaddle/FastDeploy
A high-performance inference and deployment toolkit for Large Language Models (LLMs) and Vision-Language Models (VLMs) based on PaddlePaddle.
OpenRLHF/OpenRLHF
An easy-to-use, scalable, and high-performance open-source framework for Reinforcement Learning from Human Feedback (RLHF), leveraging Ray and vLLM for distributed training of LLMs and VLMs.
heshengtao/comfyui_LLM_party
A ComfyUI-based framework for building comprehensive LLM agent workflows, integrating diverse AI models, tools, and social platforms.
stas00/ml-engineering
An open collection of methodologies, tools, and step-by-step instructions for successful training, fine-tuning, and inference of large language and multi-modal models.
oumi-ai/oumi
An end-to-end platform for fine-tuning, evaluating, and deploying open-source Large Language Models (LLMs) and Vision-Language Models (VLMs).
roboflow/maestro
A streamlined tool to accelerate the fine-tuning of popular multimodal models like Florence-2, PaliGemma 2, and Qwen2.5-VL.
emcf/thepipe
A Python library for extracting clean markdown, multimodal media, and structured data from complex documents using vision-language models.
om-ai-lab/OmAgent
A Python library simplifying the development of multimodal language agents by abstracting complex engineering and providing native multimodal support.
NexaAI/nexa-sdk
A high-performance SDK enabling day-0 local inference of frontier LLMs and VLMs across diverse hardware (NPU, GPU, CPU) and platforms (PC, mobile, IoT) with minimal energy use.