Deep Learning Model Implementation
8.4k 2026-04-18

lucidrains/imagen-pytorch

A PyTorch implementation of Google's Imagen, a state-of-the-art text-to-image neural network that surpasses DALL-E2 in synthesis quality.

Core Features

PyTorch implementation of Google's Imagen text-to-image model.
Cascading DDPM architecture conditioned on T5 text embeddings.
Dynamic clipping for improved classifier-free guidance.
Noise level conditioning and a memory-efficient UNet design.
Simpler architecture than DALL-E2.

Quick Start

pip install imagen-pytorch

Detailed Introduction

This project is a PyTorch implementation of Google's Imagen, a text-to-image neural network reported to outperform DALL-E2 in synthesis quality. Architecturally, it is a cascading DDPM (denoising diffusion probabilistic model) conditioned on text embeddings from a large pretrained T5 model: a base model generates a low-resolution image, and later stages in the cascade upsample it. Key elements include dynamic clipping for improved classifier-free guidance, noise level conditioning, and a memory-efficient UNet design. Unlike DALL-E2, Imagen does not rely on a CLIP prior network, which suggests that such priors are not always necessary for strong generative performance.
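Two of the ideas named above, classifier-free guidance and dynamic clipping, can be illustrated with a minimal pure-Python sketch. This is not the library's code: the function names (`guided_prediction`, `percentile`, `dynamic_threshold`) and the default percentile of 0.9 are assumptions chosen for illustration, and plain lists of floats stand in for image tensors.

```python
from typing import List


def guided_prediction(cond: List[float], uncond: List[float], scale: float) -> List[float]:
    """Classifier-free guidance: start from the unconditional prediction and
    move toward (or past) the text-conditioned one by a factor of `scale`."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


def percentile(values: List[float], q: float) -> float:
    """Linearly interpolated percentile of `values`, with q in [0, 1]."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def dynamic_threshold(x0: List[float], q: float = 0.9) -> List[float]:
    """Dynamic clipping of a predicted clean image (hypothetical helper).

    Pick a threshold s from the q-th percentile of absolute pixel values,
    never letting it drop below 1, then clip to [-s, s] and rescale by s
    so the result lies in [-1, 1].
    """
    s = max(percentile([abs(v) for v in x0], q), 1.0)
    return [max(-s, min(s, v)) / s for v in x0]


if __name__ == "__main__":
    # A large guidance scale can push values outside [-1, 1] ...
    guided = guided_prediction(cond=[0.9, -0.8, 0.1], uncond=[0.2, -0.1, 0.0], scale=5.0)
    # ... and dynamic clipping brings them back into range.
    print(dynamic_threshold(guided))
```

The point of the second step is that strong guidance tends to saturate pixel values; rescaling by a per-sample threshold, rather than clipping at a fixed ±1, keeps the prediction in range without flattening every large value to the same extreme.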
