Multimodal AI Model Suite

10.0k 2026-05-06

OpenGVLab/InternVL

A pioneering open-source multimodal large language model family aiming to match or exceed commercial models like GPT-4o/GPT-5 in performance.

Core Features

Pioneering open-source multimodal LLM family.

Achieves state-of-the-art performance across diverse multimodal tasks (general, reasoning, text, agentic).

Offers a range of model sizes, from large-scale versions (e.g., 241B parameters) to more efficient ones (e.g., 20B).

Open-sources its training code and data, and supports the HuggingFace `transformers` format.

Incorporates advanced techniques like Variable Visual Position Encoding and Mixed Preference Optimization.
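
The feature list names Variable Visual Position Encoding without describing it. As a rough illustration only, the sketch below assumes the general idea is that visual tokens advance the position index by a fractional step while text tokens advance it by 1, so long image sequences consume less of the position range. The function name `variable_position_ids` and the parameter `visual_step` are hypothetical and are not taken from the InternVL codebase.

```python
from fractions import Fraction


def variable_position_ids(is_visual, visual_step=Fraction(1, 4)):
    """Return one position index per token.

    Assumption (illustrative, not InternVL's actual code): the first token
    sits at position 0; each later token adds 1 if it is text, or
    `visual_step` (a value no greater than 1) if it is visual.
    """
    positions = []
    for i, visual in enumerate(is_visual):
        if i == 0:
            positions.append(Fraction(0))
        else:
            step = visual_step if visual else Fraction(1)
            positions.append(positions[-1] + step)
    return positions


# One text token, four visual tokens, one text token.
mask = [False, True, True, True, True, False]
print([float(p) for p in variable_position_ids(mask)])
# prints [0.0, 0.25, 0.5, 0.75, 1.0, 2.0]
```

With a step of 1/4, the four visual tokens together span the same position range as a single text token. Setting `visual_step` to 1 recovers ordinary sequential positions.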

Detailed Introduction

The InternVL family is an open-source suite of multimodal large language models (MLLMs) designed to rival, and potentially surpass, commercial offerings such as GPT-4o and GPT-5. Recognized with a CVPR 2024 Oral, it delivers state-of-the-art performance across general multimodal, reasoning, text, and agentic tasks. The project emphasizes transparency by open-sourcing its training code and datasets, which makes advanced multimodal AI accessible to researchers and developers.