Browserless · Arazzo Workflow

Browserless Site Capture Bundle

Version 1.0.0

Render a single URL three ways — HTML content, PNG screenshot, and PDF — in one pass.

1 workflow · 1 source API · 1 provider
Headless Browser, Browser Infrastructure, Web Automation, AI Agents, Web Scraping, BrowserQL, Puppeteer, Playwright, Selenium, CDP, Stealth, CAPTCHA Solving, Residential Proxy, PDF Generation, Screenshots, Smart Scrape, Crawl, Search, MCP, Session Recording, Hybrid Automation, Arazzo, Workflows

Provider

browserless

Workflows

site-capture-bundle
Capture HTML content, a screenshot, and a PDF of one URL.
Renders the supplied URL as HTML via /chrome/content, then captures a full-page screenshot via /chrome/screenshot, then generates a PDF via /chrome/pdf. All three renders target the same URL, producing a complete capture bundle.
3 steps · inputs: blockAds, token, url · outputs: html, pdf, screenshot
1. renderContent
{$sourceDescriptions.browserlessApi.url}#/paths/~1chrome~1content/post
Load the URL and return the fully rendered HTML content after the page's JavaScript has been parsed and executed.
2. captureScreenshot
{$sourceDescriptions.browserlessApi.url}#/paths/~1chrome~1screenshot/post
Capture a full-page PNG screenshot of the same URL. Returns a base64-encoded or binary image payload.
3. generatePdf
{$sourceDescriptions.browserlessApi.url}#/paths/~1chrome~1pdf/post
Generate a print-ready PDF of the same URL, returning the binary PDF payload.
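
Each step above names its operation with an operationPath: a source description reference followed by a JSON Pointer into the OpenAPI document, where `~1` is the JSON Pointer escape for `/`. The sketch below shows one way a runner might decode those references into a path and HTTP method. The function name `resolve_operation_path` is hypothetical and not part of any Arazzo tooling; the code handles only the `#/paths/<path>/<method>` shape used in this workflow.

```python
from urllib.parse import unquote


def resolve_operation_path(operation_path: str) -> tuple[str, str]:
    """Split an Arazzo operationPath into (API path, HTTP method)."""
    # Everything after '#' is a JSON Pointer into the OpenAPI document.
    _, _, pointer = operation_path.partition("#")
    # Pointer tokens are separated by '/'; the leading '/' yields an empty first token.
    tokens = pointer.split("/")[1:]
    # Percent-decode the fragment, then unescape per RFC 6901: '~1' -> '/', then '~0' -> '~'.
    tokens = [unquote(t).replace("~1", "/").replace("~0", "~") for t in tokens]
    if len(tokens) != 3 or tokens[0] != "paths":
        raise ValueError(f"not an operation pointer: {operation_path!r}")
    return tokens[1], tokens[2].upper()


# The three references used by the site-capture-bundle workflow.
STEP_REFS = {
    "renderContent": "{$sourceDescriptions.browserlessApi.url}#/paths/~1chrome~1content/post",
    "captureScreenshot": "{$sourceDescriptions.browserlessApi.url}#/paths/~1chrome~1screenshot/post",
    "generatePdf": "{$sourceDescriptions.browserlessApi.url}#/paths/~1chrome~1pdf/post",
}

for step_id, ref in STEP_REFS.items():
    path, method = resolve_operation_path(ref)
    print(step_id, path, method)
# renderContent /chrome/content POST
# captureScreenshot /chrome/screenshot POST
# generatePdf /chrome/pdf POST
```

The part before `#` is left unresolved here; a real runner would first substitute the source description URL and load that OpenAPI document.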

Source API Descriptions

Arazzo Workflow Specification

browserless-site-capture-bundle-workflow.yml
arazzo: 1.0.1
info:
  title: Browserless Site Capture Bundle
  summary: Render a single URL three ways — HTML content, PNG screenshot, and PDF — in one pass.
  description: >-
    A multi-render capture pipeline that takes one target URL and produces a
    complete archival bundle: the rendered HTML content, a full-page PNG
    screenshot, and a print-ready PDF. Each render is an independent POST
    against the Browserless Chrome REST APIs, with the shared URL fanned out
    to all three. Every step spells out its query parameters and JSON request
    body inline so the flow can be read and executed without opening the
    underlying OpenAPI description.
  version: 1.0.0
sourceDescriptions:
- name: browserlessApi
  url: ../openapi/browserless-openapi.yml
  type: openapi
workflows:
- workflowId: site-capture-bundle
  summary: Capture HTML content, a screenshot, and a PDF of one URL.
  description: >-
    Renders the supplied URL as HTML via /chrome/content, then captures a
    full-page screenshot via /chrome/screenshot, then generates a PDF via
    /chrome/pdf. All three renders target the same URL, producing a complete
    capture bundle.
  inputs:
    type: object
    required:
    - token
    - url
    properties:
      token:
        type: string
        description: The Browserless authorization token passed as a query parameter.
      url:
        type: string
        description: The URL of the page to capture.
      blockAds:
        type: boolean
        description: Whether to load ad-blocking extensions for the session.
  steps:
  - stepId: renderContent
    description: >-
      Load the URL and return the fully rendered HTML content after the page's
      JavaScript has been parsed and executed.
    operationPath: '{$sourceDescriptions.browserlessApi.url}#/paths/~1chrome~1content/post'
    parameters:
    - name: token
      in: query
      value: $inputs.token
    - name: blockAds
      in: query
      value: $inputs.blockAds
    requestBody:
      contentType: application/json
      payload:
        url: $inputs.url
        bestAttempt: true
    successCriteria:
    - condition: $statusCode == 200
    outputs:
      html: $response.body
  - stepId: captureScreenshot
    description: >-
      Capture a full-page PNG screenshot of the same URL. Returns a
      base64-encoded or binary image payload.
    operationPath: '{$sourceDescriptions.browserlessApi.url}#/paths/~1chrome~1screenshot/post'
    parameters:
    - name: token
      in: query
      value: $inputs.token
    - name: blockAds
      in: query
      value: $inputs.blockAds
    requestBody:
      contentType: application/json
      payload:
        url: $inputs.url
        bestAttempt: true
        options:
          fullPage: true
          type: png
    successCriteria:
    - condition: $statusCode == 200
    outputs:
      screenshot: $response.body
  - stepId: generatePdf
    description: >-
      Generate a print-ready PDF of the same URL, returning the binary PDF
      payload.
    operationPath: '{$sourceDescriptions.browserlessApi.url}#/paths/~1chrome~1pdf/post'
    parameters:
    - name: token
      in: query
      value: $inputs.token
    - name: blockAds
      in: query
      value: $inputs.blockAds
    requestBody:
      contentType: application/json
      payload:
        url: $inputs.url
        bestAttempt: true
        options:
          printBackground: true
          format: A4
    successCriteria:
    - condition: $statusCode == 200
    outputs:
      pdf: $response.body
  outputs:
    html: $steps.renderContent.outputs.html
    screenshot: $steps.captureScreenshot.outputs.screenshot
    pdf: $steps.generatePdf.outputs.pdf
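
For readers without an Arazzo runner, the workflow above can be approximated as three plain HTTP POSTs. The following is a minimal sketch, not the provider's client: `BASE_URL` is a hypothetical placeholder (the real server URL comes from the referenced OpenAPI description), the names `build_requests` and `run_bundle` are invented for illustration, and leaving `blockAds` out of the query string when the input is not supplied is an assumption about how a runner treats an unset optional input.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical placeholder; substitute the server URL from the OpenAPI description.
BASE_URL = "https://browserless.example"

# (stepId, path, workflow output name, payload options) as declared in the workflow.
STEPS = [
    ("renderContent", "/chrome/content", "html", None),
    ("captureScreenshot", "/chrome/screenshot", "screenshot", {"fullPage": True, "type": "png"}),
    ("generatePdf", "/chrome/pdf", "pdf", {"printBackground": True, "format": "A4"}),
]


def build_requests(inputs: dict, base_url: str = BASE_URL) -> dict[str, Request]:
    """Build one POST request per step, keyed by the workflow output name."""
    for name in ("token", "url"):  # the workflow's required inputs
        if name not in inputs:
            raise ValueError(f"missing required input: {name}")
    query = {"token": inputs["token"]}
    if "blockAds" in inputs:  # optional input; omitted when unset (assumption)
        query["blockAds"] = "true" if inputs["blockAds"] else "false"
    requests = {}
    for _step_id, path, output, options in STEPS:
        payload = {"url": inputs["url"], "bestAttempt": True}
        if options is not None:
            payload["options"] = options
        requests[output] = Request(
            f"{base_url}{path}?{urlencode(query)}",
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
    return requests


def run_bundle(inputs: dict, base_url: str = BASE_URL) -> dict[str, bytes]:
    """Send the requests in step order and collect the raw response bodies."""
    outputs = {}
    for output, request in build_requests(inputs, base_url).items():
        with urlopen(request) as response:  # urlopen raises HTTPError on 4xx/5xx
            if response.status != 200:  # mirrors successCriteria: $statusCode == 200
                raise RuntimeError(f"{output}: unexpected status {response.status}")
            outputs[output] = response.read()
    return outputs
```

Nothing is sent until `run_bundle` is called with a real token and server URL. Keeping request construction separate from sending makes it possible to inspect the exact URLs and JSON bodies first, which is useful when comparing them against the step definitions in the spec.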