Tags: #data-extraction - OSS Alternative - Discover Top Open Source Alternatives to Popular Software

AI-powered Web Data API
111.8k

firecrawl/firecrawl

An API for AI agents to reliably search, scrape, and interact with the web, providing clean, LLM-ready data at scale.

Document Processing and AI Data Preparation Library
Python
58.5k

docling-project/docling

Docling simplifies document processing, parsing diverse formats including advanced PDF understanding, and provides seamless integrations with the generative AI ecosystem.

Data Processing Library
Python
14.6k

Unstructured-IO/unstructured

An open-source ETL solution for transforming complex documents into clean, structured data formats, optimized for language models.

PDF Processing Library / Data Extraction Tool
Java
19.7k

opendataloader-project/opendataloader-pdf

An open-source PDF parser for AI-ready data extraction and automated PDF accessibility remediation, offering benchmark-leading accuracy.

AI-powered Web Scraping Library
Node.js
6.4k

mishushakov/llm-scraper

A TypeScript library that leverages Large Language Models to extract structured data from any webpage.

AI-powered Knowledge Graph Builder
Python
4.6k

neo4j-labs/llm-graph-builder

A powerful application that transforms diverse unstructured data sources into structured Neo4j Knowledge Graphs using Large Language Models (LLMs) and LangChain.

AI-powered Document Processing Platform
Python
5.2k

katanaml/sparrow

A production-ready platform for structured data extraction and instruction calling using ML, LLM, and Vision LLM technologies.

AI-powered Web Scraping Library
Python
23.4k

ScrapeGraphAI/Scrapegraph-ai

A Python library that leverages LLMs and graph logic to simplify web scraping and data extraction from various sources.

AI-powered Multimodal Data Extraction Library
Python
1.5k

emcf/thepipe

A Python library for extracting clean markdown, multimodal media, and structured data from complex documents using vision-language models.

© 2026 OSS Alternative. hotgithub.com - All rights reserved.