Crawl Entire Websites with a Single API Call Using Browser Rendering

Oday Bakkour

You can now crawl an entire website with a single API call using Browser Rendering's new /crawl endpoint, available in open beta. Submit a starting URL, and pages are automatically discovered, rendered in a headless browser, and returned in multiple formats, including HTML, Markdown, and structured JSON.

Overview

The endpoint acts as a signed agent that respects robots.txt and AI Crawl Control by default, making it easy for developers to comply with website rules and harder for crawlers to ignore site-owner guidance. This makes it well suited to training models, building RAG pipelines, and researching or monitoring content across a site.

How It Works

Crawl jobs run asynchronously. You submit a URL, receive a job ID, and check back for results as pages are processed.

Initiating a Crawl

curl -X POST 'https://api.cloudflare.com/client/v4/accounts/{account_id}/browser-rendering/crawl' \
-H 'Authorization: Bearer <apiToken>' \
-H 'Content-Type: application/json' \
-d '{ "url": "https://blog.cloudflare.com/" }'

Checking Results

curl -X GET 'https://api.cloudflare.com/client/v4/accounts/{account_id}/browser-rendering/crawl/{job_id}' \
-H 'Authorization: Bearer <apiToken>' \
-H 'Content-Type: application/json'
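Because crawl jobs run asynchronously, a client typically polls the job endpoint until the job finishes. The sketch below shows one way to do that in Python; the `fetch_status` callable stands in for the GET request above, and the `"status"` field and its `"completed"`/`"failed"` values are assumptions about the response shape, not confirmed API details.

```python
import time
from typing import Callable

def wait_for_crawl(fetch_status: Callable[[], dict],
                   timeout: float = 300.0,
                   interval: float = 5.0) -> dict:
    """Poll a crawl job until it reports a terminal state or the timeout elapses.

    `fetch_status` is any callable returning the parsed JSON body of a GET to
    the /crawl/{job_id} endpoint. The "status" field name and its terminal
    values are hypothetical, for illustration only.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        body = fetch_status()
        if body.get("status") in ("completed", "failed"):
            return body          # job reached a terminal state
        time.sleep(interval)     # back off before polling again
    raise TimeoutError("crawl job did not finish within the timeout")
```

Injecting the fetch callable keeps the polling logic independent of any HTTP library, so it is easy to unit-test with a stub before wiring it to a real request.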

Key Features

  • Multiple output formats - Return crawled content as HTML, Markdown, and structured JSON (powered by Workers AI)
  • Crawl scope controls - Configure crawl depth, page limits, and wildcard patterns to include or exclude specific URL paths
  • Automatic page discovery - Discovers URLs from sitemaps, page links, or both
  • Incremental crawling - Use modifiedSince and maxAge to skip pages that haven't changed or were recently fetched, saving time and cost on repeated crawls
  • Static mode - Set render: false to fetch static HTML without spinning up a browser, for faster crawling of static sites
  • Well-behaved bot - Honors robots.txt directives, including crawl-delay

Availability

Available on both the Workers Free and Paid plans.

Important Notes

  • The /crawl endpoint cannot bypass Cloudflare bot detection or captchas, and self-identifies as a bot
  • If you want your own site to be crawled, review best practices for robots.txt and sitemaps

Getting Started

To get started, refer to the crawl endpoint documentation.
