Deep Learning Model Implementation
8.4k 2026-04-18
lucidrains/imagen-pytorch
A PyTorch implementation of Imagen, Google's text-to-image neural network, which was reported at release to surpass DALL-E2 in synthesis quality.
Core Features
PyTorch implementation of Google's Imagen.
State-of-the-art text-to-image synthesis.
Cascading DDPM architecture with T5 text embeddings.
Dynamic clipping for classifier-free guidance, noise level conditioning, and a memory-efficient UNet design.
Simpler architecture compared to DALL-E2.
Quick Start
pip install imagen-pytorch
Detailed Introduction
This project is a PyTorch implementation of Imagen, Google's text-to-image neural network, which was reported at release to outperform DALL-E2. Architecturally, it is a cascading DDPM (Denoising Diffusion Probabilistic Model): a sequence of diffusion models, each conditioned on text embeddings from a large pretrained T5 model, that generates an image at low resolution and then upsamples it. Key additions include dynamic clipping for improved classifier-free guidance, noise level conditioning, and a memory-efficient UNet design. The design is simpler than DALL-E2 and suggests that neither CLIP nor a separate prior network is needed for strong generative performance.
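To make the guidance and clipping steps concrete, here is a minimal, framework-free sketch of the arithmetic on plain Python lists. It is not the library's code: the function names (classifier_free_guidance, percentile_abs, dynamic_threshold) and the percentile values are assumptions chosen for illustration, and a real implementation would apply the same operations to batched tensors. Guidance is applied to the model's conditional and unconditional predictions, while the thresholding step is applied to the predicted clean image of a single sample.

```python
def classifier_free_guidance(cond, uncond, scale):
    """Push the prediction away from the unconditional one by `scale`."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


def percentile_abs(values, q):
    """q-th quantile (0 <= q <= 1) of |values|, using linear interpolation."""
    ordered = sorted(abs(v) for v in values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


def dynamic_threshold(x0, q=0.95):
    """Clip one sample's predicted clean image to [-s, s], then divide by s."""
    # Floor s at 1.0 so the step reduces to ordinary clipping to [-1, 1]
    # whenever the chosen percentile is already within range.
    s = max(percentile_abs(x0, q), 1.0)
    return [max(-s, min(s, v)) / s for v in x0]


# Toy values (hypothetical): a guidance scale of 3 amplifies the difference
# between the conditional and unconditional predictions.
guided = classifier_free_guidance(
    [0.5, -2.0, 1.0, 4.0], [0.25, -1.0, 0.5, 2.0], scale=3.0
)  # [1.0, -4.0, 2.0, 8.0]

# Toy predicted image with out-of-range pixels; after thresholding every
# value lies in [-1, 1].
rescaled = dynamic_threshold([1.0, -4.0, 2.0, 8.0], q=0.5)
```

Large guidance scales tend to produce out-of-range values like these, which is why the rescaling step is paired with classifier-free guidance: it keeps the intermediate image within [-1, 1] without saturating most pixels at the boundary.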