AI Data Lakehouse Format
6.4k 2026-04-26
lance-format/lance
An open lakehouse format for multimodal AI, offering high-performance random access, vector indexing, and data versioning.
Core Features
Expressive hybrid search combining vector, full-text, and SQL analytics.
Fast random access, reported by the project to significantly outperform Parquet/Iceberg.
Native support for multimodal data like images, videos, audio, text, and embeddings.
Efficient data evolution with column additions without full table rewrites.
Zero-copy versioning with ACID transactions, time travel, and branching.
Quick Start
pip install pylance
Detailed Introduction
Lance is an open lakehouse format designed for multimodal AI workloads. It provides a file format, a table format, and a catalog specification, which together make it possible to build a complete lakehouse on object storage and to power AI workflows such as search engines, feature stores, and large-scale ML training. Its focus on high-performance I/O, random access, and native support for diverse data types targets the main difficulties of managing and querying complex AI datasets.