Tags: #etl - OSS Alternative - Discover Top Open Source Alternatives to Popular Software

Data Integration Platform
SeaTunnel Zeta Engine
9.3k

apache/seatunnel

SeaTunnel is a high-performance, distributed data integration tool designed to synchronize massive amounts of multimodal data from diverse sources with efficiency and stability.

Data Processing Library
Python
14.6k

Unstructured-IO/unstructured

An open-source ETL solution for transforming complex documents into clean, structured data formats, optimized for language models.

Workflow Orchestration Platform
Python
45.2k

apache/airflow

A platform to programmatically author, schedule, and monitor data workflows.

Data Pipeline Framework
Python
10.9k

kedro-org/kedro

A Python framework for building reproducible, maintainable, and modular data engineering and data science pipelines using software engineering best practices.

Dataflow Orchestration Library
Python
2.5k

apache/hamilton

Apache Hamilton is a lightweight Python library that enables data scientists and engineers to define testable, modular, and self-documenting dataflows (DAGs) with built-in lineage and metadata, portable across any Python environment.

Data Orchestration Platform
Python
15.4k

dagster-io/dagster

A cloud-native data pipeline orchestrator designed for the development, production, and observation of data assets, featuring integrated lineage, observability, and a declarative programming model.

Workflow Orchestration Framework
Python
22.3k

PrefectHQ/prefect

Prefect is a Python-based workflow orchestration framework designed to build resilient, dynamic data pipelines that automate processes and recover from unexpected changes.

Data Orchestration Platform
OpenJDK
1.4k

apache/hop

An open-source platform designed to facilitate all aspects of data and metadata orchestration, enabling efficient data integration and pipeline management.

Airflow Extension for dbt Orchestration
Apache Airflow
1.2k

astronomer/astronomer-cosmos

Integrate dbt Core projects seamlessly into Apache Airflow DAGs and Task Groups, enabling robust data transformation orchestration.

LLM-powered Data Processing Framework
Python
3.7k

ucbepic/docetl

DocETL is an agentic LLM-powered framework designed for building and executing complex data processing and ETL pipelines, especially for documents.

AI Agent Context Management Framework
Python
9.6k

cocoindex-io/cocoindex

CocoIndex is an incremental data indexing framework that provides continuously fresh context from diverse enterprise data sources for AI agents and LLM applications.


© 2026 OSS Alternative. hotgithub.com - All rights reserved.