mishushakov/llm-scraper - OSS Alternative - Discover Top Open Source Alternatives to Popular Software
AI-powered Web Scraping Library
6.4k 2026-04-26

mishushakov/llm-scraper

A TypeScript library that leverages Large Language Models to extract structured data from any webpage.

Core Features

Supports a range of LLMs, including GPT, Sonnet, Gemini, Llama, and Qwen.
Defines data extraction schemas using Zod or JSON Schema for type-safety.
Built on Playwright for robust browser automation and content loading.
Offers streaming of extracted objects and automatic code generation for scraping scripts.
Provides six flexible content formatting modes for diverse scraping needs.
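
To make the schema-driven idea above concrete, here is a minimal, dependency-free sketch of checking a model's JSON reply against a declared schema before returning it. Everything in it is an assumption made for illustration: the `Schema` type is a tiny hand-rolled subset of JSON Schema, and `validate`, `extract`, `Model`, and `stubModel` are hypothetical names, not part of llm-scraper's API. The library itself relies on Zod or JSON Schema for this step.

```typescript
// Hypothetical minimal JSON Schema subset for this sketch; not llm-scraper's own types.
type Schema =
  | { type: "string" }
  | { type: "number" }
  | { type: "array"; items: Schema }
  | { type: "object"; properties: { [key: string]: Schema }; required?: string[] };

// Returns human-readable validation errors; an empty array means the value conforms.
function validate(value: unknown, schema: Schema, path: string = "$"): string[] {
  if (schema.type === "string" || schema.type === "number") {
    return typeof value === schema.type ? [] : [`${path}: expected ${schema.type}`];
  }
  if (schema.type === "array") {
    if (!Array.isArray(value)) return [`${path}: expected array`];
    const items = schema.items;
    const errors: string[] = [];
    for (let i = 0; i < value.length; i++) {
      errors.push(...validate(value[i], items, `${path}[${i}]`));
    }
    return errors;
  }
  // Remaining case: an object schema.
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return [`${path}: expected object`];
  }
  const record = value as { [key: string]: unknown };
  const errors: string[] = [];
  for (const key of schema.required ?? []) {
    if (!(key in record)) errors.push(`${path}.${key}: missing required property`);
  }
  for (const key of Object.keys(schema.properties)) {
    if (key in record) {
      errors.push(...validate(record[key], schema.properties[key], `${path}.${key}`));
    }
  }
  return errors;
}

// Stand-in for an LLM call: takes a prompt, resolves to the model's raw text reply.
type Model = (prompt: string) => Promise<string>;

// Asks the model for JSON, parses it, and rejects replies that do not match the schema.
async function extract(pageText: string, schema: Schema, model: Model): Promise<unknown> {
  const prompt =
    `Return JSON matching this schema:\n${JSON.stringify(schema)}\n\nPage:\n${pageText}`;
  const parsed: unknown = JSON.parse(await model(prompt));
  const errors = validate(parsed, schema);
  if (errors.length > 0) {
    throw new Error(`Schema validation failed: ${errors.join("; ")}`);
  }
  return parsed;
}

// Usage with a stub model that always returns the same canned reply.
const storySchema: Schema = {
  type: "object",
  properties: { title: { type: "string" }, points: { type: "number" } },
  required: ["title"],
};
const stubModel: Model = async () => JSON.stringify({ title: "Example story", points: 42 });

extract("<html>...</html>", storySchema, stubModel).then((data) => console.log(data));
```

Validating after parsing means a malformed or incomplete model reply fails loudly instead of flowing into downstream code as loosely typed data.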

Quick Start

npm i zod playwright llm-scraper

Detailed Introduction

LLM Scraper is a TypeScript library that uses Large Language Models (LLMs) to turn unstructured webpage content into structured data. You describe the output you want with a Zod or JSON Schema definition, and the extracted data is shaped to match it. The library is built on Playwright, which handles browser automation and page loading. It can also stream extracted objects as they are produced and generate code for scraping scripts. Support for several content formats and a range of LLMs lets the same approach be applied to many kinds of pages.
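
As a rough illustration of what consuming streamed objects can look like, the sketch below turns a stream of text chunks into parsed objects as soon as each one is complete. It assumes newline-delimited JSON as the wire format, which is an assumption made for this example and not a description of how llm-scraper streams internally. `JsonLineBuffer`, `streamObjects`, and `fakeModelStream` are hypothetical names.

```typescript
// Buffers incoming text and emits one parsed JSON value per completed line.
// Newline-delimited JSON is an assumption made for this sketch.
class JsonLineBuffer {
  private buffer = "";

  // Adds a chunk and returns every object completed by it.
  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const out: unknown[] = [];
    let newline = this.buffer.indexOf("\n");
    while (newline !== -1) {
      const line = this.buffer.slice(0, newline).trim();
      this.buffer = this.buffer.slice(newline + 1);
      if (line.length > 0) out.push(JSON.parse(line));
      newline = this.buffer.indexOf("\n");
    }
    return out;
  }

  // Parses whatever remains once the stream has ended.
  flush(): unknown[] {
    const rest = this.buffer.trim();
    this.buffer = "";
    return rest.length > 0 ? [JSON.parse(rest)] : [];
  }
}

// Wraps the buffer in an async generator so callers can use `for await`.
async function* streamObjects(chunks: AsyncIterable<string>): AsyncGenerator<unknown> {
  const lines = new JsonLineBuffer();
  for await (const chunk of chunks) {
    for (const obj of lines.push(chunk)) yield obj;
  }
  for (const obj of lines.flush()) yield obj;
}

// Stub stream standing in for a model response; note the object split across chunks.
async function* fakeModelStream(): AsyncGenerator<string> {
  yield '{"title":"First"}\n{"tit';
  yield 'le":"Second"}\n';
  yield '{"title":"Third"}';
}

async function main(): Promise<void> {
  for await (const obj of streamObjects(fakeModelStream())) {
    console.log(obj);
  }
}

main();
```

Keeping the buffering logic in a small synchronous class makes the chunk-boundary handling easy to test separately from the async plumbing.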
