Browserless · Arazzo Workflow
Browserless Full Page Archive
Version 1.0.0
Unblock a protected URL, then branch into structured scraping or a rendered-content fallback, depending on whether the unblock step returned HTML.
Headless Browser, Browser Infrastructure, Web Automation, AI Agents, Web Scraping, BrowserQL, Puppeteer, Playwright, Selenium, CDP, Stealth, CAPTCHA Solving, Residential Proxy, PDF Generation, Screenshots, Smart Scrape, Crawl, Search, MCP, Session Recording, Hybrid Automation, Arazzo, Workflows
Workflows
full-page-archive
Unblock a URL, then scrape structured data or fall back to a content render.
Clears bot detection via /chrome/unblock, then branches. When the unblock response includes HTML content, the workflow scrapes structured elements via /chrome/scrape using the recovered cookies; otherwise it falls back to a /chrome/content render using the same cookies. Both branches reuse the cleared session.
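The branching described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the workflow's actual implementation: the `post` transport callable, the function name `run_full_page_archive`, and every request and response field name (`cookies`, `content`, `screenshot`, `elements`, `selector`) are hypothetical placeholders and should be checked against the Browserless API description before use.

```python
from typing import Any, Callable, Dict, List

# Hypothetical transport: takes an API path and a JSON-like payload and
# returns the decoded JSON response. Authentication is left to the caller.
Post = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_full_page_archive(post: Post, url: str, selectors: List[str]) -> Dict[str, Any]:
    """Unblock a URL, then scrape it or fall back to a content render."""
    # Step 1 (unblockSite): clear bot detection and request cookies and HTML.
    # The payload field names are assumptions made for this sketch.
    unblocked = post(
        "/chrome/unblock",
        {"url": url, "cookies": True, "content": True, "screenshot": True},
    )
    cookies = unblocked.get("cookies") or []
    html = unblocked.get("content")

    if html:
        # Step 2 (scrapeElements): HTML came back, so extract structured
        # elements while reusing the recovered cookies.
        payload = {
            "url": url,
            "elements": [{"selector": s} for s in selectors],
            "cookies": cookies,
        }
        return {"branch": "scrapeElements", "result": post("/chrome/scrape", payload)}

    # Step 3 (fallbackContent): no HTML came back, so render the page
    # via /chrome/content with the same cookies.
    payload = {"url": url, "cookies": cookies}
    return {"branch": "fallbackContent", "result": post("/chrome/content", payload)}
```

Passing the transport in as a callable keeps the branching logic separate from HTTP and authentication details, so the same function can be driven by a real client or by a stub.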
1. unblockSite
{$sourceDescriptions.browserlessApi.url}#/paths/~1chrome~1unblock/post
Clear bot detection on the URL, returning cookies, HTML content, and a base64 full-page screenshot.
2. scrapeElements
{$sourceDescriptions.browserlessApi.url}#/paths/~1chrome~1scrape/post
Extract structured element data for the supplied selectors from the unblocked URL, reusing the cookies recovered from the unblock step.
3. fallbackContent
{$sourceDescriptions.browserlessApi.url}#/paths/~1chrome~1content/post
Fall back to rendering the page HTML via /chrome/content when the unblock step returned no content, reusing the recovered cookies.
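Each step's operation reference ends in a JSON Pointer fragment, in which `~1` is the escaped form of `/` and `~0` is the escaped form of `~` (RFC 6901). The sketch below decodes such a fragment into its path segments; the helper name `decode_operation_pointer` is hypothetical and exists only for illustration.

```python
from typing import List
from urllib.parse import unquote


def decode_operation_pointer(operation_path: str) -> List[str]:
    """Split the JSON Pointer fragment of an operation reference into tokens."""
    # Everything after '#' is the JSON Pointer into the source description.
    _, _, fragment = operation_path.partition("#")
    if not fragment.startswith("/"):
        raise ValueError("expected a JSON Pointer fragment such as '#/paths/...'")
    tokens = fragment[1:].split("/")
    # Undo percent-encoding first, then the JSON Pointer escapes.
    # '~1' must be replaced before '~0' so that '~01' decodes to '~1'.
    return [unquote(t).replace("~1", "/").replace("~0", "~") for t in tokens]


ref = "{$sourceDescriptions.browserlessApi.url}#/paths/~1chrome~1unblock/post"
print(decode_operation_pointer(ref))  # ['paths', '/chrome/unblock', 'post']
```

Read this way, the three references above point at the POST operations on /chrome/unblock, /chrome/scrape, and /chrome/content.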