TypeScript Library
6.3k 2026-04-13
mishushakov/llm-scraper
A TypeScript library to extract structured data from any webpage using Large Language Models (LLMs).
Core Features
Leverages LLMs (GPT, Sonnet, Gemini, Llama, Qwen) for intelligent data extraction.
Defines data schemas using Zod or JSON Schema for type-safe output.
Built on Playwright for robust browser automation and content loading.
Supports various content formatting modes (HTML, Markdown, Text, Image, Custom).
Offers streaming capabilities for partial object results and code generation.
Quick Start
npm i zod playwright llm-scraper
Detailed Introduction
LLM Scraper is a TypeScript library that combines Large Language Models (LLMs) with the Playwright browser automation framework to extract data from webpages. Developers describe the output they want as a Zod or JSON Schema definition, and the library turns a page's unstructured content into structured data matching that schema. It supports multiple LLM providers, type-safe output, streaming of partial results, and several content formatting modes, which reduces the amount of hand-written, site-specific parsing that scraping usually requires.
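
The schema-driven idea described above can be illustrated with a small, dependency-free sketch. This is not llm-scraper's API: the names Schema, Infer, Model, validate, extract, stubModel, and pageSchema are all hypothetical, the schema format is a simplified stand-in for Zod or JSON Schema, and the model is a stub so that the example runs offline without Playwright or an LLM provider.

```typescript
// Conceptual sketch only: none of these names come from llm-scraper.
type FieldType = 'string' | 'number' | 'boolean'
type Schema = Record<string, FieldType>

// Maps a schema to the TypeScript type of an object that conforms to it.
type Infer<S extends Schema> = {
  [K in keyof S]: S[K] extends 'string'
    ? string
    : S[K] extends 'number'
      ? number
      : boolean
}

// A model is anything that turns a prompt into text (hypothetical signature).
type Model = (prompt: string) => Promise<string>

// Check that every field named in the schema is present with the right type.
function validate<S extends Schema>(schema: S, value: unknown): Infer<S> {
  if (typeof value !== 'object' || value === null) {
    throw new Error('model output is not an object')
  }
  const record = value as Record<string, unknown>
  for (const [key, expected] of Object.entries(schema)) {
    if (typeof record[key] !== expected) {
      throw new Error(`field "${key}" should be a ${expected}`)
    }
  }
  return record as unknown as Infer<S>
}

// Ask the model for JSON that matches the schema, then validate the reply.
async function extract<S extends Schema>(
  content: string,
  schema: S,
  model: Model,
): Promise<Infer<S>> {
  const prompt = `Return JSON matching ${JSON.stringify(schema)} for:\n${content}`
  const raw = await model(prompt)
  return validate(schema, JSON.parse(raw))
}

// Stand-in for a real LLM call so the sketch runs offline.
const stubModel: Model = async () => '{"title": "Example Domain", "links": 1}'

const pageSchema = { title: 'string', links: 'number' } as const

extract('<h1>Example Domain</h1>', pageSchema, stubModel).then((data) => {
  // `data` is typed as { title: string; links: number }
  console.log(data.title, data.links)
})
```

The point of the design is that the schema does double duty: it tells the model what shape to produce, and it lets the caller reject any reply that does not fit before the data reaches application code.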