Tags: #vlm
hiyouga/LlamaFactory
A unified and efficient framework for fine-tuning over 100 large language models (LLMs) and vision-language models (VLMs) through both a CLI and a Web UI.
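As a sketch of how LlamaFactory-style training is driven, the framework is configured through a single YAML file passed to its CLI. The field names below follow the examples bundled with the repo but are assumptions here; the model, dataset, and template values are placeholders, so check the project's current docs before running.

```yaml
# Hypothetical minimal LoRA SFT config for LlamaFactory.
# Key names mirror the repo's example configs and may differ across versions.
model_name_or_path: Qwen/Qwen2.5-VL-7B-Instruct   # placeholder model
stage: sft
do_train: true
finetuning_type: lora
lora_target: all
dataset: mllm_demo                                # placeholder dataset name
template: qwen2_vl
output_dir: saves/qwen2_5vl-lora
per_device_train_batch_size: 1
learning_rate: 1.0e-4
num_train_epochs: 3.0
```

A config like this would typically be launched with the project's CLI (e.g. `llamafactory-cli train config.yaml`), with the Web UI generating an equivalent file interactively.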
OpenRLHF/OpenRLHF
An easy-to-use, scalable, and high-performance open-source framework for Reinforcement Learning from Human Feedback (RLHF) based on Ray and vLLM.
heshengtao/comfyui_LLM_party
A ComfyUI extension providing a comprehensive LLM agent framework for building custom AI assistants, integrating diverse AI models, and automating complex workflows.
stas00/ml-engineering
An open collection of methodologies, tools, and step-by-step instructions for successfully training, fine-tuning, and running inference with large language and multimodal models.
oumi-ai/oumi
An end-to-end platform for fine-tuning, evaluating, and deploying open-source Large Language Models (LLMs) and Vision Language Models (VLMs).
roboflow/maestro
A streamlined tool to accelerate the fine-tuning process for multimodal models like Florence-2, PaliGemma 2, and Qwen2.5-VL.
emcf/thepipe
A Python library for extracting clean markdown, multimodal media, and structured data from complex documents using vision-language models.
om-ai-lab/VLM-R1
A stable and generalizable R1-style Large Vision-Language Model (VLM) framework that enhances visual understanding tasks through reinforcement learning, outperforming SFT models in generalization.
om-ai-lab/OmAgent
A Python library that simplifies the development of multimodal language agents by abstracting away complex engineering details and providing native multimodal support.
NexaAI/nexa-sdk
A high-performance local inference framework for running frontier multimodal AI models on various devices with minimal energy consumption.