LTX-Video
Official repository for LTX-Video
Introduction
LTX-Video is the first DiT-based video generation model that can generate high-quality videos in real-time. It can generate 30 FPS videos at 1216×704 resolution, faster than it takes to watch them. The model is trained on a large-scale dataset of diverse videos and can generate high-resolution videos with realistic and diverse content.
The model supports text-to-video, image-to-video, keyframe-based animation, video extension (both forward and backward), video-to-video transformations, and any combination of these features.
Models & Workflows
| Name | Notes | inference.py config | ComfyUI workflow (Recommended) |
|---|---|---|---|
| ltxv-13b-0.9.7-dev | Highest quality, requires more VRAM | ltxv-13b-0.9.7-dev.yaml | ltxv-13b-i2v-base.json |
| ltxv-13b-0.9.7-mix | Mixes ltxv-13b-dev and ltxv-13b-distilled in the same multi-scale rendering workflow for a balanced speed-quality trade-off | N/A | ltxv-13b-i2v-mixed-multiscale.json |
| ltxv-13b-0.9.7-distilled | Faster and uses less VRAM, with a slight quality reduction compared to 13b; ideal for rapid iterations | ltxv-13b-0.9.7-distilled.yaml | ltxv-13b-dist-i2v-base.json |
| ltxv-13b-0.9.7-distilled-lora128 | LoRA that makes ltxv-13b-dev behave like the distilled model | N/A | N/A |
| ltxv-13b-0.9.7-fp8 | Quantized version of ltxv-13b | Coming soon | ltxv-13b-i2v-base-fp8.json |
| ltxv-13b-0.9.7-distilled-fp8 | Quantized version of ltxv-13b-distilled | Coming soon | ltxv-13b-dist-i2v-base-fp8.json |
| ltxv-2b-0.9.6 | Good quality, lower VRAM requirement than ltxv-13b | ltxv-2b-0.9.6-dev.yaml | ltxvideo-i2v.json |
| ltxv-2b-0.9.6-distilled | 15× faster, real-time capable, fewer steps needed, no STG/CFG required | ltxv-2b-0.9.6-distilled.yaml | ltxvideo-i2v-distilled.json |
Quick Start Guide
Online inference
The model is accessible right away via the following links:
- LTX-Studio image-to-video (13B-mix)
- LTX-Studio image-to-video (13B distilled)
- Fal.ai text-to-video
- Fal.ai image-to-video
- Replicate text-to-video and image-to-video
Run locally
Installation
The codebase was tested with Python 3.10.5, CUDA version 12.2, and supports PyTorch >= 2.1.2. On macOS, MPS was tested with PyTorch 2.3.0 and should support PyTorch == 2.3 or >= 2.6.
git clone https://github.com/Lightricks/LTX-Video.git
cd LTX-Video
# create env
python -m venv env
source env/bin/activate
python -m pip install -e .\[inference-script\]
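Before running inference, it can help to confirm that PyTorch actually sees a CUDA or MPS device in the new environment. The snippet below is a minimal, optional check and is not part of the repository.

```python
# Optional environment check (not part of the LTX-Video repo): verify that
# PyTorch can see a CUDA or Apple MPS device before running inference.
import torch

if torch.cuda.is_available():
    print("CUDA device:", torch.cuda.get_device_name(0))
elif torch.backends.mps.is_available():
    print("Apple MPS backend is available")
else:
    print("No CUDA/MPS device found; generation will be very slow on CPU")
```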
Inference
📝 Note: For best results, we recommend using our ComfyUI workflow. We’re working on updating the inference.py script to match the high quality and output fidelity of ComfyUI.
To use our model, please follow the inference code in inference.py:
For text-to-video generation:
python inference.py --prompt "PROMPT" --height HEIGHT --width WIDTH --num_frames NUM_FRAMES --seed SEED --pipeline_config configs/ltxv-13b-0.9.7-distilled.yaml
For image-to-video generation:
python inference.py --prompt "PROMPT" --conditioning_media_paths IMAGE_PATH --conditioning_start_frames 0 --height HEIGHT --width WIDTH --num_frames NUM_FRAMES --seed SEED --pipeline_config configs/ltxv-13b-0.9.7-distilled.yaml
Extending a video:
📝 Note: Input video segments must contain a multiple of 8 frames plus 1 (e.g., 9, 17, 25, etc.), and the target frame number should be a multiple of 8.
python inference.py --prompt "PROMPT" --conditioning_media_paths VIDEO_PATH --conditioning_start_frames START_FRAME --height HEIGHT --width WIDTH --num_frames NUM_FRAMES --seed SEED --pipeline_config configs/ltxv-13b-0.9.7-distilled.yaml
For video generation with multiple conditions:
You can now generate a video conditioned on a set of images and/or short video segments. Simply provide a list of paths to the images or video segments you want to condition on, along with their target frame numbers in the generated video. You can also specify the conditioning strength for each item (default: 1.0).
python inference.py --prompt "PROMPT" --conditioning_media_paths IMAGE_OR_VIDEO_PATH_1 IMAGE_OR_VIDEO_PATH_2 --conditioning_start_frames TARGET_FRAME_1 TARGET_FRAME_2 --height HEIGHT --width WIDTH --num_frames NUM_FRAMES --seed SEED --pipeline_config configs/ltxv-13b-0.9.7-distilled.yaml
ComfyUI Integration
To use our model with ComfyUI, please follow the instructions at https://github.com/Lightricks/ComfyUI-LTXVideo/.
Diffusers Integration
To use our model with the Diffusers Python library, check out the official documentation.
Diffusers also supports an 8-bit version of LTX-Video; see the official documentation for details.
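As a rough sketch of the Diffusers route, the snippet below uses the LTXPipeline class from recent Diffusers releases for text-to-video. The model ID, resolution, frame count, and step count are illustrative values and may need adjusting for the 13B checkpoints; treat the official Diffusers documentation as authoritative.

```python
# Sketch of text-to-video with the Diffusers LTXPipeline (model ID and
# parameters are illustrative; see the Diffusers docs for current usage).
import torch
from diffusers import LTXPipeline
from diffusers.utils import export_to_video

pipe = LTXPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16)
pipe.to("cuda")

video = pipe(
    prompt="A woman walks along a windswept beach at sunset, waves rolling in behind her",
    width=704,
    height=480,
    num_frames=161,        # 8 * 20 + 1 frames
    num_inference_steps=50,
).frames[0]

export_to_video(video, "output.mp4", fps=24)
```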
Model User Guide
📝 Prompt Engineering
When writing prompts, focus on detailed, chronological descriptions of actions and scenes. Include specific movements, appearances, camera angles, and environmental details, all in a single flowing paragraph. Start directly with the action, and keep descriptions literal and precise. Think like a cinematographer describing a shot list. Keep prompts within 200 words. For best results, build your prompts using this structure (an illustrative example follows the list):
- Start with main action in a single sentence
- Add specific details about movements and gestures
- Describe character/object appearances precisely
- Include background and environment details
- Specify camera angles and movements
- Describe lighting and colors
- Note any changes or sudden events
- See examples for more inspiration.
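For example, a prompt following this structure might read (illustrative, not an official example): "A man in a gray wool coat walks briskly across a rain-slicked city street at dusk, holding a black umbrella that tilts against the wind. He steps around a puddle reflecting neon shop signs, glances over his shoulder, and quickens his pace toward a waiting taxi. The camera tracks him from the side at street level in a smooth dolly shot, with shallow depth of field blurring passing cars behind him. Cool blue ambient light mixes with warm orange reflections from the storefronts, and halfway through the shot a bus crosses the foreground, briefly hiding him from view."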
Automatic Prompt Enhancement
When using inference.py, short prompts (below prompt_enhancement_words_threshold words) are automatically enhanced by a language model. This is supported for text-to-video and image-to-video (first-frame conditioning) generation.
When using LTXVideoPipeline directly, you can enable prompt enhancement by setting enhance_prompt=True.
🎮 Parameter Guide
- Resolution Preset: Use higher resolutions for detailed scenes and lower resolutions for faster generation and simpler scenes. The model works on resolutions divisible by 32 and frame counts of the form 8 × k + 1 (e.g. 257); if the requested resolution or frame count does not satisfy these constraints, the input is padded with -1 and then cropped back to the requested size. The model works best at resolutions under 720 × 1280 and frame counts below 257; see the rounding sketch after this list.
- Seed: Save seed values to recreate specific styles or compositions you like
- Guidance Scale: 3-3.5 are the recommended values
- Inference Steps: More steps (40+) for quality, fewer steps (20-30) for speed
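As a small arithmetic sketch of the resolution constraint above, the helper below rounds a requested size up to the nearest valid dimensions (spatial sides divisible by 32, frame count of the form 8 × k + 1). It is illustrative only and not part of the repo; the pipeline performs the equivalent padding internally.

```python
# Illustrative helper (not part of the repo): round a requested size up to
# dimensions the model works on: sides divisible by 32, frames of form 8k + 1.
import math

def round_up_dims(height: int, width: int, num_frames: int) -> tuple[int, int, int]:
    height = math.ceil(height / 32) * 32
    width = math.ceil(width / 32) * 32
    num_frames = math.ceil((num_frames - 1) / 8) * 8 + 1
    return height, width, num_frames

print(round_up_dims(720, 1280, 250))  # -> (736, 1280, 257)
```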
📝 For advanced parameter usage, please see python inference.py --help