AI-powered Document Processing Platform
5.2k 2026-04-26
katanaml/sparrow
A production-ready platform for structured data extraction and instruction calling using ML, LLM, and Vision LLM technologies.
Core Features
Universal document processing (invoices, receipts, forms, etc.)
Pluggable architecture for mixing and matching pipelines
Multi-format support (images, multi-page PDFs)
JSON schema-based extraction with automatic validation
API-first design with interactive web UI and real-time processing
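One way to picture the pluggable, mix-and-match pipeline idea is a small registry that maps a pipeline name to a function. The sketch below is only an illustration under that assumption: the names PIPELINES, register, run, stub-vision, and stub-ocr are hypothetical and are not taken from Sparrow's codebase or API.

```python
from typing import Callable, Dict

# Hypothetical registry (not Sparrow's API): maps a pipeline name to a
# callable that takes a document path and returns extracted fields.
Pipeline = Callable[[str], dict]
PIPELINES: Dict[str, Pipeline] = {}


def register(name: str) -> Callable[[Pipeline], Pipeline]:
    """Decorator that stores a pipeline function in the registry under `name`."""
    def wrap(fn: Pipeline) -> Pipeline:
        PIPELINES[name] = fn
        return fn
    return wrap


@register("stub-vision")
def stub_vision(path: str) -> dict:
    # Stand-in for a Vision LLM backend; returns fixed data for illustration.
    return {"source": path, "backend": "stub-vision"}


@register("stub-ocr")
def stub_ocr(path: str) -> dict:
    # Stand-in for an OCR-based backend; returns fixed data for illustration.
    return {"source": path, "backend": "stub-ocr"}


def run(pipeline: str, path: str) -> dict:
    """Look up a pipeline by name and run it on one document."""
    if pipeline not in PIPELINES:
        raise KeyError(f"unknown pipeline: {pipeline!r}")
    return PIPELINES[pipeline](path)
```

With this shape, swapping backends means changing the name passed to run, for example run("stub-ocr", "invoice.pdf"), while the calling code stays the same.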
Detailed Introduction
Sparrow is a production-ready platform that turns unstructured documents and images into clean, structured data. It uses Machine Learning, Large Language Models (LLMs), and Vision LLMs to automate information extraction from document types such as invoices, receipts, and forms. Its pluggable architecture, multi-backend support, and API-first design make it a flexible and scalable option for businesses that want to streamline data processing workflows and integrate AI-driven extraction into them.
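To make schema-based extraction with validation concrete, here is a minimal sketch that checks a model's JSON output against a list of expected fields and types. It uses only the Python standard library; INVOICE_SCHEMA and validate_extraction are hypothetical names chosen for illustration, and the simplified field-to-type mapping stands in for a full JSON Schema. It does not reflect how Sparrow itself implements validation.

```python
import json

# Hypothetical schema for illustration: field name -> accepted Python type(s).
INVOICE_SCHEMA = {
    "invoice_number": str,
    "total": (int, float),
    "currency": str,
}


def validate_extraction(raw_json: str, schema: dict) -> dict:
    """Parse extracted JSON and keep only schema fields, raising on problems."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")

    errors = []
    for field, expected in schema.items():
        if field not in data:
            errors.append(f"missing field: {field}")
        elif not isinstance(data[field], expected):
            errors.append(f"wrong type for {field}: {type(data[field]).__name__}")
    if errors:
        raise ValueError("; ".join(errors))

    # Drop anything the schema does not ask for.
    return {field: data[field] for field in schema}
```

A caller would pass the raw model output and the schema, for example validate_extraction(response_text, INVOICE_SCHEMA), and treat a ValueError as a signal to retry or flag the document for review.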