Multi-Page Extraction

Extract and compare data across multiple pages

Some extractions span more than one page — paginated results, detail pages linked from a list, or data from two structurally identical pages you want to compare.

Two approaches

1. Single-page agent with navigation

The default single-page agent can navigate across pages as part of its exploration. It clicks "Next page" buttons, follows detail page links, and collects data across the navigation path. This works well for:

Paginated lists (click-based pagination or infinite scroll)
List → detail page flows (collect a list, visit each item for more fields)
Category tabs or filter-based navigation

Describe the navigation intent in your website description:

"This is a paginated event listing. Click the 'Load more' button until no more events load, then collect all events."

"Each event card links to a detail page with the full schedule. Visit each detail page to get the full lineup."

The agent uses navigation tools (click, scroll, wait) to reach the data before writing its extraction script.
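
To make the paginated case concrete, here is a minimal Python sketch of the collect-while-navigating pattern. It is not the agent's actual script: `FAKE_SITE`, `fetch_page`, and `collect_all` are hypothetical names, and the in-memory dictionary stands in for real browser page loads.

```python
from typing import Callable, Optional

# Hypothetical in-memory "site": each page lists its items and the key of the next page.
FAKE_SITE = {
    "page-1": {"items": ["Event A", "Event B"], "next": "page-2"},
    "page-2": {"items": ["Event C"], "next": "page-3"},
    "page-3": {"items": ["Event D", "Event E"], "next": None},
}


def fetch_page(key: str) -> dict:
    """Stand-in for loading a page in a browser (hypothetical helper)."""
    return FAKE_SITE[key]


def collect_all(start: str, fetch: Callable[[str], dict], max_pages: int = 50) -> list[str]:
    """Follow 'next' links from `start`, gathering items until no next page remains."""
    items: list[str] = []
    seen: set[str] = set()
    key: Optional[str] = start
    # Stop on the last page, on a repeated page (loop guard), or at the page cap.
    while key is not None and key not in seen and len(seen) < max_pages:
        seen.add(key)
        page = fetch(key)
        items.extend(page["items"])
        key = page["next"]
    return items


all_events = collect_all("page-1", fetch_page)
print(all_events)
```

The `seen` set and the `max_pages` cap are design choices for this sketch: they keep a navigation loop from running forever if a "next" link points back to an earlier page.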

2. Multi-page agent (two-page comparison)

The multi-page agent is designed for a specific use case: two pages from the same site that are structurally identical (e.g., two match pages, two product pages from the same store). It:

Loads both pages simultaneously
Reads the cleaned HTML from each
Generates a single extraction script that works for both
Returns one result object per page

This is a fast, one-shot extraction — no validation loop, no network inspection. It works best for DOM-consistent pages where the single-page agent's deeper analysis is not needed.
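
The idea of one script serving two structurally identical pages can be sketched in Python as follows. This is an illustration under stated assumptions, not the script the multi-page agent generates: the `title` and `price` class names, the sample HTML, and the `FieldParser` and `extract` helpers are all hypothetical.

```python
from html.parser import HTMLParser


class FieldParser(HTMLParser):
    """Collects the text of elements whose class attribute is one of `wanted`."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.current = None  # class name of the field currently being read
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if cls in self.wanted:
            self.current = cls

    def handle_data(self, data):
        # Store the first non-empty text that follows a wanted start tag.
        if self.current and data.strip():
            self.fields[self.current] = data.strip()
            self.current = None


def extract(html: str) -> dict:
    """Apply the same extraction logic to any page that shares the structure."""
    parser = FieldParser(["title", "price"])
    parser.feed(html)
    return parser.fields


# Two hypothetical product pages with identical DOM structure.
PAGE_A = '<div><h1 class="title">Blue Mug</h1><span class="price">12.00</span></div>'
PAGE_B = '<div><h1 class="title">Red Mug</h1><span class="price">14.50</span></div>'

# One result object per page, produced by a single extraction function.
results = [extract(page) for page in (PAGE_A, PAGE_B)]
print(results)
```

Because both pages share the same structure, the comparison reduces to reading the same keys from each result object.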

Use the multi-page agent when:

You want to compare the same data across two pages on the same site
The site doesn't expose data via APIs and DOM extraction is sufficient
You need results from both pages in a single run

Performance notes

Multi-page extractions take longer per run than single-page ones. Each additional page requires a browser load, rendering time, and extraction passes. For large page sets (50+), consider:

Using a playbook (skips AI exploration on subsequent runs)
Batching with a workflow (run multiple playbooks in parallel)
Starting with a small page count to validate the schema before scaling up
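
The batching and validate-first advice can be sketched in Python as shown below. The `run_playbook` function is a hypothetical stand-in for a single playbook run, and the URLs are placeholders; the product's own workflow feature is not modeled here.

```python
from concurrent.futures import ThreadPoolExecutor


def run_playbook(url: str) -> dict:
    """Hypothetical stand-in for one playbook run; a real run would load and extract the page."""
    return {"url": url, "status": "ok"}


def run_batch(urls: list[str], max_workers: int = 4) -> list[dict]:
    """Run one playbook per URL in parallel; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_playbook, urls))


# Placeholder page set of 60 URLs.
urls = [f"https://example.com/items/{i}" for i in range(1, 61)]

# Validate the schema on a small sample before scaling to the full set.
sample = run_batch(urls[:3])
assert all(r["status"] == "ok" for r in sample)

results = run_batch(urls)
print(len(results))
```

Threads suit this sketch because each run would spend most of its time waiting on page loads; `max_workers` caps how many pages are in flight at once.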

