OpenGVLab/InternVideo

A series of video foundation models and large-scale datasets designed for comprehensive multimodal video understanding and generation.

Core Features

Comprehensive video foundation models (InternVideo, InternVideo2, InternVideo2.5, InternVideo-Next).

Large-scale video-text datasets (InternVid) for training and evaluation.

Support for multimodal video understanding and generation tasks.

Integration with large language models for video-centric dialogue systems.

Continuous development with new models and data releases.

Detailed Introduction

InternVideo is a project from OpenGVLab focused on video foundation models and multimodal video understanding. It comprises a growing family of models (InternVideo, InternVideo2, InternVideo2.5, and InternVideo-Next), each advancing generative and discriminative learning for video. Alongside the models, the project provides InternVid, a large-scale video-text dataset for training and evaluating multimodal systems. Together, the models and data target video analysis, generation, and video-centric dialogue, with the longer-term goal of supporting applications in real-world understanding.