Tags: #data-pipeline - OSS Alternative - Discover Top Open Source Alternatives to Popular Software

Observability Data Pipeline
Rust
21.7k

vectordotdev/vector

A high-performance, end-to-end observability data pipeline that empowers users to collect, transform, and route all their logs and metrics with significant cost reduction and enhanced control.

AI/ML Framework
32.5k

microsoft/graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system designed to extract structured data from unstructured text using LLMs to enhance reasoning on private data.

Workflow Orchestration Platform
Python
45.2k

apache/airflow

A platform to programmatically author, schedule, and monitor data workflows.

Data Orchestration Platform
Python
15.4k

dagster-io/dagster

A cloud-native data pipeline orchestrator designed for the development, production, and observation of data assets, featuring integrated lineage, observability, and a declarative programming model.

AI/ML Data Processing Framework
Python
3.4k

towhee-io/towhee

A cutting-edge framework for building fast and simple neural data processing pipelines, especially for unstructured multi-modal data using LLMs.

AI/ML Data Curation Library
Python
1.7k

bespokelabsai/curator

A Python library for generating and curating high-quality synthetic data for AI model training and structured data extraction.

Airflow Extension for dbt Orchestration
Apache Airflow
1.2k

astronomer/astronomer-cosmos

Integrate dbt Core projects seamlessly into Apache Airflow DAGs and Task Groups, enabling robust data transformation orchestration.

Data Orchestration Platform
Docker
14.3k

apache/dolphinscheduler

Apache DolphinScheduler is a modern, low-code data orchestration platform designed for agile development and high-performance management of complex data workflows and task dependencies.

Data Ingestion Engine
Docker
4.7k

jitsucom/jitsu

An open-source, self-hosted Segment alternative for real-time event data collection and streaming to data warehouses.

AI Agent Context Management Framework
Python
9.6k

cocoindex-io/cocoindex

CocoIndex is an incremental data indexing framework that provides continuously fresh context from diverse enterprise data sources for AI agents and LLM applications.

© 2026 OSS Alternative. hotgithub.com - All rights reserved.