deepseek-ai/Janus
The Janus series is a family of unified autoregressive multimodal models that both understand and generate content across modalities, built on a decoupled visual encoding strategy.
Detailed Introduction
Janus is an autoregressive framework that unifies multimodal understanding and generation. It decouples visual encoding into separate pathways, one for understanding and one for generation, while still processing both with a single transformer. This decoupling resolves the conflict that arises when one visual encoder has to serve both roles, makes the framework more flexible, and comes with reported state-of-the-art performance. Janus-Pro, a later iteration, refines the training strategy, expands the training data, and scales up the model size, which significantly improves multimodal understanding, text-to-image instruction following, and the stability of generation. The series is positioned as a strong candidate for next-generation unified multimodal models.
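The decoupling idea described above can be illustrated with a toy sketch. This is not the Janus implementation: every name here (`understanding_encoder`, `generation_tokenizer`, `shared_backbone`, `run`, `DIM`, `CODEBOOK_SIZE`) is hypothetical, the "image" is just a list of patches, and the math is placeholder arithmetic. It only shows the routing pattern: two separate visual pathways that both feed one shared autoregressive backbone.

```python
from typing import List

Vector = List[float]
Image = List[List[float]]  # toy "image": one inner list per patch

DIM = 4            # toy hidden size (assumption, not a Janus value)
CODEBOOK_SIZE = 8  # toy visual vocabulary size (assumption)


def understanding_encoder(image: Image) -> List[Vector]:
    """Understanding pathway: one continuous feature vector per patch."""
    return [[sum(patch) / len(patch)] * DIM for patch in image]


def generation_tokenizer(image: Image) -> List[int]:
    """Generation pathway: one discrete codebook id per patch."""
    return [int(sum(patch) * 10) % CODEBOOK_SIZE for patch in image]


# Embedding table for generation tokens: one DIM-sized vector per codebook id.
GEN_EMBEDDINGS: List[Vector] = [[float(i)] * DIM for i in range(CODEBOOK_SIZE)]


def shared_backbone(sequence: List[Vector]) -> List[Vector]:
    """Stand-in for the single transformer: a causal running mean.

    Position t only depends on inputs 0..t, mimicking autoregressive order.
    """
    outputs: List[Vector] = []
    running = [0.0] * DIM
    for step, vec in enumerate(sequence, start=1):
        running = [r + v for r, v in zip(running, vec)]
        outputs.append([r / step for r in running])
    return outputs


def run(image: Image, task: str) -> List[Vector]:
    """Pick the visual pathway for the task, then use the shared backbone."""
    if task == "understanding":
        inputs = understanding_encoder(image)
    elif task == "generation":
        inputs = [GEN_EMBEDDINGS[t] for t in generation_tokenizer(image)]
    else:
        raise ValueError(f"unknown task: {task}")
    return shared_backbone(inputs)


image = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
und = run(image, "understanding")  # continuous-feature pathway
gen = run(image, "generation")     # discrete-token pathway
```

Both calls end in the same `shared_backbone`, so only the visual front end differs per task. In this sketch that is the whole point of the decoupling: each pathway can use a representation suited to its role without forcing a compromise on the other.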