sensible-so · Arazzo Workflow

Sensible Portfolio Extract From URL And Poll

Version 1.0.0

Segment and extract a multi-document portfolio at a URL, then poll until every sub-document extraction completes.

1 workflow · 1 source API · 1 provider

Provider

sensible-so

Workflows

portfolio-extract-from-url-and-poll
Extract a multi-document portfolio from a URL and poll the portfolio extraction to completion.
Submits a portfolio URL and the document types to segment it into, then polls the returned portfolio id until Sensible reports COMPLETE and returns the per-document extraction results.
2 steps
inputs: apiKey, documentUrl, segmentDocumentsWith, types
outputs: documents, portfolioId, status

1. submitPortfolio (operation: provide-a-download-url-for-a-pdf-portfolio)
   Submit the portfolio URL and the document types to segment it into, capturing the returned portfolio extraction id.
2. pollPortfolio (operation: retrieving-results)
   Poll the portfolio extraction by id until Sensible reports the COMPLETE status, retrying while it is still WAITING or PROCESSING.
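
The polling step can be sketched in Python as a loop over the three statuses the workflow names. This is a minimal sketch under stated assumptions, not Sensible's client: poll_portfolio, its fetch callable, the attempt limit, and the delay are all hypothetical (the workflow itself defines no limit or delay), and treating any other status as an error is a choice made here for illustration.

```python
import time
from typing import Any, Callable, Dict

# Statuses named in the workflow's pollPortfolio step.
PENDING_STATUSES = {"WAITING", "PROCESSING"}
DONE_STATUS = "COMPLETE"


def poll_portfolio(
    fetch: Callable[[str], Dict[str, Any]],
    portfolio_id: str,
    max_attempts: int = 30,      # assumption: the workflow sets no limit
    delay_seconds: float = 2.0,  # assumption: the workflow sets no delay
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch(portfolio_id) until the returned body reports COMPLETE."""
    for attempt in range(max_attempts):
        body = fetch(portfolio_id)
        status = body.get("status")
        if status == DONE_STATUS:
            return body  # corresponds to the step's "end" outcome
        if status not in PENDING_STATUSES:
            # Neither COMPLETE nor pending: stop instead of looping forever.
            raise RuntimeError(f"unexpected portfolio status: {status!r}")
        if attempt < max_attempts - 1:
            sleep(delay_seconds)  # pause before polling the same id again
    raise TimeoutError(
        f"portfolio {portfolio_id} not COMPLETE after {max_attempts} attempts"
    )


# Usage with a stand-in fetch that replays canned response bodies.
responses = iter([
    {"id": "abc", "status": "WAITING"},
    {"id": "abc", "status": "PROCESSING"},
    {"id": "abc", "status": "COMPLETE", "documents": []},
])
result = poll_portfolio(lambda _id: next(responses), "abc", sleep=lambda s: None)
```

Passing the HTTP call in as fetch keeps the loop independent of any particular client library, so the same logic can be exercised with canned responses as in the usage lines.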

Source API Descriptions

extractionsApi (openapi): ../openapi/sensible-extractions-api-openapi.yml

Arazzo Workflow Specification

sensible-so-portfolio-extract-from-url-and-poll-workflow.yml
arazzo: 1.0.1
info:
  title: Sensible Portfolio Extract From URL And Poll
  summary: Segment and extract a multi-document portfolio at a URL, then poll until every sub-document extraction completes.
  description: >-
    Handles the multi-document "portfolio" case where several documents are
    bundled into a single file. The workflow submits the portfolio URL together
    with the list of document types Sensible should segment it into, receives a
    portfolio extraction id, and polls the Retrieve extraction by ID endpoint
    until the portfolio reports a COMPLETE status. On completion it surfaces the
    per-document results as the documents output. Every step spells out its
    request inline, including the Bearer authorization.
  version: 1.0.0
sourceDescriptions:
- name: extractionsApi
  url: ../openapi/sensible-extractions-api-openapi.yml
  type: openapi
workflows:
- workflowId: portfolio-extract-from-url-and-poll
  summary: Extract a multi-document portfolio from a URL and poll the portfolio extraction to completion.
  description: >-
    Submits a portfolio URL and the document types to segment it into, then
    polls the returned portfolio id until Sensible reports COMPLETE and returns
    the per-document extraction results.
  inputs:
    type: object
    required:
    - apiKey
    - documentUrl
    - types
    properties:
      apiKey:
        type: string
        description: Sensible API key used as the Bearer token.
      documentUrl:
        type: string
        description: A publicly accessible or presigned URL returning the portfolio PDF bytes.
      types:
        type: array
        description: The document types contained in the portfolio (e.g. ["tax_returns","bank_statements"]).
        items:
          type: string
      segmentDocumentsWith:
        type: string
        description: Method Sensible uses to segment the portfolio into per-document page ranges.
        enum:
        - llm
        - fingerprints
        default: fingerprints
  steps:
  - stepId: submitPortfolio
    description: >-
      Submit the portfolio URL and the document types to segment it into,
      capturing the returned portfolio extraction id.
    operationId: provide-a-download-url-for-a-pdf-portfolio
    parameters:
    - name: Authorization
      in: header
      value: "Bearer {$inputs.apiKey}"
    requestBody:
      contentType: application/json
      payload:
        document_url: $inputs.documentUrl
        types: $inputs.types
        segment_documents_with: $inputs.segmentDocumentsWith
    successCriteria:
    - condition: $statusCode == 200
    outputs:
      portfolioId: $response.body#/id
      status: $response.body#/status
  - stepId: pollPortfolio
    description: >-
      Poll the portfolio extraction by id until Sensible reports the COMPLETE
      status, retrying while it is still WAITING or PROCESSING.
    operationId: retrieving-results
    parameters:
    - name: Authorization
      in: header
      value: "Bearer {$inputs.apiKey}"
    - name: id
      in: path
      value: $steps.submitPortfolio.outputs.portfolioId
    successCriteria:
    - condition: $statusCode == 200
    outputs:
      status: $response.body#/status
      documents: $response.body#/documents
      coverage: $response.body#/coverage
      validationSummary: $response.body#/validation_summary
    onSuccess:
    - name: portfolioComplete
      type: end
      criteria:
      - context: $response.body
        condition: $.status == "COMPLETE"
        type: jsonpath
    - name: keepPolling
      type: goto
      stepId: pollPortfolio
      criteria:
      - context: $response.body
        condition: $.status == "WAITING" || $.status == "PROCESSING"
        type: jsonpath
  outputs:
    portfolioId: $steps.submitPortfolio.outputs.portfolioId
    status: $steps.pollPortfolio.outputs.status
    documents: $steps.pollPortfolio.outputs.documents
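
The submitPortfolio request defined above can be assembled from the workflow inputs roughly as follows. This is a hedged sketch, not part of the workflow: build_submit_request is a hypothetical helper, and it only mirrors the inputs schema (the fingerprints default and the two allowed segmentation values) and the step's header and payload fields. It does not send anything over the network.

```python
from typing import Any, Dict, List, Optional, Tuple

# Allowed values from the segmentDocumentsWith enum in the inputs schema.
ALLOWED_SEGMENTERS = ("llm", "fingerprints")


def build_submit_request(
    api_key: str,
    document_url: str,
    types: List[str],
    segment_documents_with: Optional[str] = None,
) -> Tuple[Dict[str, str], Dict[str, Any]]:
    """Return (headers, JSON payload) matching the submitPortfolio step."""
    # Apply the schema default when the optional input is omitted.
    segmenter = segment_documents_with or "fingerprints"
    if segmenter not in ALLOWED_SEGMENTERS:
        raise ValueError(
            f"segment_documents_with must be one of {ALLOWED_SEGMENTERS}, "
            f"got {segmenter!r}"
        )
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {
        "document_url": document_url,
        "types": list(types),
        "segment_documents_with": segmenter,
    }
    return headers, payload


# Usage: the URL and key below are placeholders, not real values.
headers, payload = build_submit_request(
    "YOUR_API_KEY",
    "https://example.com/portfolio.pdf",
    ["tax_returns", "bank_statements"],
)
```

The returned pair can be handed to whichever HTTP client is in use; the id in the response body is then the value the pollPortfolio step passes as its path parameter.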