google/langextract
A Python library leveraging LLMs to extract structured information from unstructured text with precise source grounding and interactive visualization.
Core Features
Quick Start
pip install langextract
Detailed Introduction
LangExtract is a Python library for extracting structured information from unstructured text documents using Large Language Models (LLMs). It processes materials such as clinical notes or reports, identifying and organizing key details while grounding every extraction in its location in the source text. The library enforces a schema to produce consistent structured outputs, is optimized for long documents, and provides interactive visualization for reviewing results. It supports a range of LLMs, from cloud-based services like Google Gemini to local models via Ollama, and adapts to new domains without requiring model fine-tuning.
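To make the idea of source grounding concrete, here is a minimal sketch in plain Python. It is not LangExtract's API or its alignment method: the names `GroundedExtraction` and `ground_extractions` are hypothetical, the candidate list stands in for model output, and the matching is a simple exact substring search. It only illustrates what it means for each extracted value to carry character offsets into the original text, and how a value that cannot be located is flagged rather than silently accepted.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class GroundedExtraction:
    """Hypothetical record: one extracted value plus its location in the source."""
    label: str
    text: str
    start: Optional[int]  # offset of the first character, or None if not found
    end: Optional[int]    # offset one past the last character, or None if not found


def ground_extractions(
    source: str, extractions: List[Tuple[str, str]]
) -> List[GroundedExtraction]:
    """Attach character offsets to (label, text) pairs by exact substring search."""
    grounded = []
    cursor = 0
    for label, text in extractions:
        # Search forward from the previous match first, so repeated values
        # map to successive occurrences; fall back to a search from the start.
        start = source.find(text, cursor)
        if start == -1:
            start = source.find(text)
        if start == -1:
            # The value does not appear verbatim in the source: leave it ungrounded.
            grounded.append(GroundedExtraction(label, text, None, None))
            continue
        end = start + len(text)
        grounded.append(GroundedExtraction(label, text, start, end))
        cursor = end
    return grounded


note = "Patient was given 250 mg amoxicillin twice daily for 7 days."
# Stand-in for model output; a real run would obtain these pairs from an LLM.
candidates = [
    ("dosage", "250 mg"),
    ("medication", "amoxicillin"),
    ("duration", "10 days"),  # deliberately absent from the note
]
results = ground_extractions(note, candidates)
for item in results:
    print(item)
```

Slicing the source with the stored offsets returns the extracted text exactly, which is what lets a reviewer or a visualization highlight each value in context. The third candidate has no offsets because "10 days" never appears in the note, so it can be surfaced for review.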