Data Pipeline Framework
10.8k 2026-04-13
kedro-org/kedro
A Python framework for building reproducible, maintainable, and modular data engineering and data science pipelines using software engineering best practices.
Core Features
Standard project template based on Cookiecutter Data Science
Lightweight Data Catalog for various file formats and systems, including data/model versioning
Pipeline Abstraction with automatic dependency resolution and visualization (Kedro-Viz)
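To illustrate what "automatic dependency resolution" means, here is a minimal sketch in plain Python that uses only the standard library. It is not Kedro's API or its implementation. The node functions (`clean`, `total`, `report`), the `NODES` list, and the `run` helper are hypothetical names chosen for illustration. The idea is that each node declares its named inputs and output, and the execution order is derived from those declarations rather than from the order in which the nodes are listed.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


# Hypothetical node functions; the names are illustrative only.
def clean(raw):
    return [x for x in raw if x is not None]


def total(cleaned):
    return sum(cleaned)


def report(cleaned, total_value):
    return f"{len(cleaned)} values, total {total_value}"


# Each node is (function, input dataset names, output dataset name).
# The nodes are deliberately listed out of execution order.
NODES = [
    (report, ["cleaned", "total"], "report"),
    (total, ["cleaned"], "total"),
    (clean, ["raw"], "cleaned"),
]


def run(nodes, data):
    """Run nodes in an order derived from their declared inputs and outputs."""
    producers = {output: (func, inputs) for func, inputs, output in nodes}
    # An output depends on every input that is itself produced by another node.
    graph = {
        output: [name for name in inputs if name in producers]
        for output, (_, inputs) in producers.items()
    }
    results = dict(data)
    for name in TopologicalSorter(graph).static_order():
        func, inputs = producers[name]
        results[name] = func(*(results[i] for i in inputs))
    return results


results = run(NODES, {"raw": [1, None, 2, 3]})
print(results["report"])  # prints "3 values, total 6"
```

Although `report` is listed first, it runs last because it consumes the outputs of the other two nodes. In Kedro itself, nodes and pipelines are declared through the framework's own objects and the data is read through the Data Catalog, so this sketch only mirrors the ordering concept.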
Quick Start
uv pip install kedro
Detailed Introduction
Kedro is an open-source Python framework that applies software engineering best practices to data science and data engineering workflows. It helps data professionals build robust, reproducible, and modular data pipelines and move projects from experimentation to production. Hosted by the LF AI & Data Foundation, Kedro provides tools for managing data, defining pipeline structures, and maintaining code quality, which keeps complex data projects manageable as they grow.