Tags: #web-scraping
firecrawl/firecrawl
An API for AI agents to reliably search, scrape, and interact with the web, providing clean, LLM-ready data at scale.
D4Vinci/Scrapling
An adaptive Python web scraping framework designed to handle everything from single requests to large-scale crawls, featuring anti-bot bypass and intelligent parsing.
browserbase/stagehand
An AI browser automation framework that combines natural language and code for flexible, reliable, and maintainable web control.
browser-use/browser-use
A library that enables AI agents to interact with websites and automate tasks on them, making web content accessible to large language models.
mishushakov/llm-scraper
A TypeScript library that uses large language models to extract structured data from any webpage.
ScrapeGraphAI/Scrapegraph-ai
A Python library that leverages LLMs and graph logic to simplify web scraping and data extraction from various sources.
rom1504/img2dataset
A highly efficient command-line tool to download, resize, and package large sets of image URLs into machine learning datasets.
vercel-labs/agent-browser
A fast native Rust CLI for browser automation, specifically designed for AI agents.
CloakHQ/CloakBrowser
A stealth Chromium browser with C++ source-level fingerprint patches to bypass advanced bot detection, serving as a drop-in replacement for Playwright and Puppeteer.