TypeScript Library
6.3k 2026-04-13

mishushakov/llm-scraper

A TypeScript library to extract structured data from any webpage using Large Language Models (LLMs).

Core Features

Leverages LLMs (GPT, Sonnet, Gemini, Llama, Qwen) for intelligent data extraction.
Defines data schemas using Zod or JSON Schema for type-safe output.
Built on Playwright for robust browser automation and content loading.
Supports various content formatting modes (HTML, Markdown, Text, Image, Custom).
Streams partial object results and supports code generation.
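As a rough illustration of the content formatting modes listed above, the sketch below picks how page HTML is prepared before it is handed to a model. It is a conceptual sketch, not llm-scraper's own code: `Format`, `FormatOptions`, `stripTags`, and `formatContent` are hypothetical names, and only the HTML, Text, and Custom modes are covered (Markdown and Image are left out to keep it dependency-free).

```typescript
// Conceptual sketch only: every name here is hypothetical, not llm-scraper's API.
type Format = 'html' | 'text' | 'custom'

interface FormatOptions {
  format: Format
  // Only consulted when format is 'custom'.
  formatFunction?: (html: string) => string
}

// Crude HTML-to-text reduction: drop script/style blocks, then all tags,
// then collapse runs of whitespace.
function stripTags(html: string): string {
  return html
    .replace(/<(script|style)[\s\S]*?<\/\1>/gi, ' ')
    .replace(/<[^>]*>/g, ' ')
    .replace(/\s+/g, ' ')
    .trim()
}

// Choose the representation of the page that would be sent to the model.
function formatContent(html: string, options: FormatOptions): string {
  switch (options.format) {
    case 'html':
      return html
    case 'text':
      return stripTags(html)
    case 'custom':
      if (!options.formatFunction) {
        throw new Error("format 'custom' requires a formatFunction")
      }
      return options.formatFunction(html)
  }
}
```

A text-like mode generally sends fewer tokens to the model, while an HTML mode keeps attributes such as link targets that a schema may ask for.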

Quick Start

npm i zod playwright llm-scraper

Detailed Introduction

LLM Scraper is a TypeScript library that combines Large Language Models (LLMs) with Playwright browser automation to extract data from webpages. The developer describes the desired output with a Zod or JSON Schema, and the library turns unstructured page content into structured data matching that schema. It supports multiple LLM families, type-safe output, streaming, and several content formatting modes.
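The schema-driven flow described above can be pictured with the dependency-free sketch below. It is an assumption-laden illustration rather than the library's implementation: the tiny `Schema` type stands in for Zod or JSON Schema, `Model` is a hypothetical stand-in for an LLM call, and `validate` and `extract` are made-up helper names.

```typescript
// Conceptual sketch only: every name here is hypothetical, not llm-scraper's API.

// A minimal schema: field name -> expected primitive type.
type FieldType = 'string' | 'number'
type Schema = Record<string, FieldType>

// Stand-in for an LLM call: takes a prompt, resolves to JSON text.
type Model = (prompt: string) => Promise<string>

// Keep only the schema's fields and reject values of the wrong type.
function validate(
  schema: Schema,
  value: unknown
): Record<string, string | number> {
  if (typeof value !== 'object' || value === null) {
    throw new Error('model output is not an object')
  }
  const record = value as Record<string, unknown>
  const result: Record<string, string | number> = {}
  for (const [key, type] of Object.entries(schema)) {
    const field = record[key]
    if (typeof field !== type) {
      throw new Error(`field "${key}" should be a ${type}`)
    }
    result[key] = field as string | number
  }
  return result
}

// Ask the model for JSON matching the schema, then validate what comes back.
async function extract(
  pageText: string,
  schema: Schema,
  model: Model
): Promise<Record<string, string | number>> {
  const prompt =
    `Return JSON with fields ${JSON.stringify(schema)} ` +
    `extracted from this page:\n${pageText}`
  const raw = await model(prompt)
  return validate(schema, JSON.parse(raw))
}
```

Validating the model's reply against the schema is what makes the result safe to treat as typed data: a reply with a missing or mistyped field fails loudly instead of flowing into the application.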
