Tags: #ocr
docling-project/docling
Docling simplifies document processing by parsing diverse formats, including advanced PDF understanding, and provides seamless integrations with the generative AI ecosystem.
run-llama/llama_index
LlamaIndex is an open-source framework for building intelligent agentic applications that connect Large Language Models (LLMs) with private or custom data sources, with a focus on document understanding and OCR.
T8RIN/ImageToolbox
A powerful Android application for advanced image manipulation, offering a wide range of tools from basic editing to filters and OCR.
Upsonic/Upsonic
A Python framework for building autonomous and traditional AI agents, offering robust tools, prebuilt components, and integrated OCR capabilities.
icereed/paperless-gpt
An AI-powered add-on for paperless-ngx that leverages LLMs and advanced OCR to automate document title, tag, correspondent, and custom field generation, streamlining digital document management.
katanaml/sparrow
A production-ready platform for structured data extraction and instruction calling using ML, LLM, and Vision LLM technologies.
umlx5h/LLPlayer
An advanced media player designed for language learners, offering dual subtitles, AI-powered real-time translation and subtitle generation, and instant word lookup.
Baiyuetribe/paper2gui
Paper2GUI converts complex AI research papers into user-friendly, install-free desktop applications, making advanced AI accessible to everyone.
pot-app/pot-desktop
A versatile cross-platform desktop application that provides efficient text translation and optical character recognition (OCR) by integrating a wide array of AI and traditional service providers.
AlibabaResearch/AdvancedLiterateMachinery
A research initiative by Alibaba's Tongyi Lab, focusing on developing advanced AI systems capable of reading, thinking, and creating, with an initial emphasis on sophisticated OCR and document understanding technologies.