Mindee · Arazzo Workflow

Mindee OCR Full Text

Version 1.0.0

Enqueue a document for full-page OCR, poll until processed, then read the per-page words and text content.

1 workflow · 2 source APIs · 1 provider
Tags: Document Parsing, OCR, IDP, AI, Machine Learning, Invoices, Receipts, IDs, Computer Vision, Arazzo, Workflows

Provider

mindee

Workflows

ocr-full-text
Upload a document for OCR and read the per-page recognized text.
Sends a file to the OCR enqueue endpoint, polls the job until processing finishes, and retrieves the OCR result containing the words and text content for each page.
3 steps · inputs: authorization, file, filename, modelId · outputs: jobId, pages

1. enqueueOcr (Enqueue_OCR_Product_Inference_v2_products_ocr_enqueue_post)
   Send the document to the asynchronous OCR queue. Returns a job whose status begins as Processing.
2. pollJob (Get_Job_Status_v2_jobs__job_id__get)
   Poll the shared jobs endpoint until the OCR job reports Processed or Failed.
3. getResult (Get_OCR_Product_Result_v2_products_ocr_results__inference_id__get)
   Retrieve the completed OCR inference and read the recognized words and text content for each page of the document.
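
The three steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not Mindee's client library: the URL paths are inferred from the operation IDs, and `run_ocr_full_text`, `send`, `max_polls`, and `delay` are hypothetical names introduced for illustration. `send` stands for any callable that performs the HTTP request and returns `(status_code, json_body)`, so the control flow can be exercised without a network.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple

# Hypothetical transport signature:
# (method, path, headers, query params, form fields) -> (status code, decoded JSON body)
Send = Callable[
    [str, str, Dict[str, str], Optional[Dict[str, Any]], Optional[Dict[str, Any]]],
    Tuple[int, Dict[str, Any]],
]


def run_ocr_full_text(
    send: Send,
    authorization: str,
    model_id: str,
    file: bytes,
    filename: Optional[str] = None,
    max_polls: int = 30,   # assumption: the workflow itself sets no polling limit
    delay: float = 1.0,    # assumption: the workflow itself sets no polling delay
) -> Dict[str, Any]:
    headers = {"Authorization": authorization}

    # Step 1 (enqueueOcr): submit the document; the workflow expects HTTP 202.
    form: Dict[str, Any] = {"model_id": model_id, "file": file}
    if filename is not None:
        form["filename"] = filename
    code, body = send("POST", "/v2/products/ocr/enqueue", headers, None, form)
    if code != 202:
        raise RuntimeError(f"enqueue failed with HTTP {code}")
    job_id = body["job"]["id"]

    # Step 2 (pollJob): repeat while the job is Processing, stop on Processed.
    for _ in range(max_polls):
        code, body = send("GET", f"/v2/jobs/{job_id}", headers, {"redirect": "false"}, None)
        if code != 200:
            raise RuntimeError(f"job poll failed with HTTP {code}")
        status = body["job"]["status"]
        if status == "Processed":
            break
        if status != "Processing":
            # Covers Failed and any unexpected status value.
            raise RuntimeError(f"OCR job {job_id} ended with status {status}")
        time.sleep(delay)
    else:
        raise TimeoutError(f"OCR job {job_id} still Processing after {max_polls} polls")

    # Step 3 (getResult): the workflow passes the job id as inference_id.
    code, body = send("GET", f"/v2/products/ocr/results/{job_id}", headers, None, None)
    if code != 200:
        raise RuntimeError(f"result fetch failed with HTTP {code}")

    # Same shape as the workflow outputs: jobId and pages.
    return {"jobId": job_id, "pages": body["inference"]["result"]["pages"]}
```

Injecting `send` keeps the sketch independent of any particular HTTP library; a real caller would wrap its client of choice and prepend the API base URL, which the workflow leaves to the source OpenAPI descriptions.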

Source API Descriptions

ocrApi (openapi): ../openapi/mindee-ocr-api-openapi.yml
jobsApi (openapi): ../openapi/mindee-jobs-api-openapi.yml

Arazzo Workflow Specification

mindee-ocr-full-text-workflow.yml
arazzo: 1.0.1
info:
  title: Mindee OCR Full Text
  summary: Enqueue a document for full-page OCR, poll until processed, then read the per-page words and text content.
  description: >-
    Runs full-page optical character recognition over any document using
    Mindee's asynchronous OCR utility. The workflow uploads a file to the OCR
    queue, polls the shared jobs endpoint until the job is Processed, and
    fetches the OCR inference to read the recognized words and full text
    content for each page along with their bounding polygons. Every step spells
    out its parameters and request body inline so the flow can be read without
    opening the underlying OpenAPI descriptions; running it still requires them
    to resolve each operationId.
  version: 1.0.0
sourceDescriptions:
- name: ocrApi
  url: ../openapi/mindee-ocr-api-openapi.yml
  type: openapi
- name: jobsApi
  url: ../openapi/mindee-jobs-api-openapi.yml
  type: openapi
workflows:
- workflowId: ocr-full-text
  summary: Upload a document for OCR and read the per-page recognized text.
  description: >-
    Sends a file to the OCR enqueue endpoint, polls the job until processing
    finishes, and retrieves the OCR result containing the words and text
    content for each page.
  inputs:
    type: object
    required:
    - authorization
    - modelId
    - file
    properties:
      authorization:
        type: string
        description: Mindee API key sent in the Authorization header.
      modelId:
        type: string
        description: UUID of the OCR utility model to apply.
      file:
        type: string
        description: The document file to upload as binary form data.
      filename:
        type: string
        description: Optional filename to associate with the uploaded document.
  steps:
  - stepId: enqueueOcr
    description: >-
      Send the document to the asynchronous OCR queue. Returns a job whose
      status begins as Processing.
    operationId: Enqueue_OCR_Product_Inference_v2_products_ocr_enqueue_post
    parameters:
    - name: Authorization
      in: header
      value: $inputs.authorization
    requestBody:
      contentType: multipart/form-data
      payload:
        model_id: $inputs.modelId
        file: $inputs.file
        filename: $inputs.filename
    successCriteria:
    - condition: $statusCode == 202
    outputs:
      jobId: $response.body#/job/id
      status: $response.body#/job/status
  - stepId: pollJob
    description: >-
      Poll the shared jobs endpoint until the OCR job reports Processed or
      Failed.
    operationId: Get_Job_Status_v2_jobs__job_id__get
    parameters:
    - name: Authorization
      in: header
      value: $inputs.authorization
    - name: job_id
      in: path
      value: $steps.enqueueOcr.outputs.jobId
    - name: redirect
      in: query
      value: false
    successCriteria:
    - condition: $statusCode == 200
    outputs:
      status: $response.body#/job/status
    onSuccess:
    - name: jobProcessed
      type: goto
      stepId: getResult
      criteria:
      - context: $response.body
        condition: $.job.status == "Processed"
        type: jsonpath
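    - name: jobPending
      type: goto
      stepId: pollJob
      criteria:
      - context: $response.body
        condition: $.job.status == "Processing"
        type: jsonpath
    # End the workflow on failure so it does not fall through to getResult.
    - name: jobFailed
      type: end
      criteria:
      - context: $response.body
        condition: $.job.status == "Failed"
        type: jsonpath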
  - stepId: getResult
    description: >-
      Retrieve the completed OCR inference and read the recognized words and
      text content for each page of the document.
    operationId: Get_OCR_Product_Result_v2_products_ocr_results__inference_id__get
    parameters:
    - name: Authorization
      in: header
      value: $inputs.authorization
    - name: inference_id
      in: path
      value: $steps.enqueueOcr.outputs.jobId
    successCriteria:
    - condition: $statusCode == 200
    outputs:
      inferenceId: $response.body#/inference/id
      pages: $response.body#/inference/result/pages
  outputs:
    jobId: $steps.enqueueOcr.outputs.jobId
    pages: $steps.getResult.outputs.pages
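
Output expressions in the specification above, such as `$response.body#/inference/result/pages`, pair a runtime expression with a JSON Pointer (RFC 6901) fragment. The following is a minimal sketch of how a runner might resolve the pointer part against a decoded response body. `resolve_pointer` and `resolve_output` are hypothetical helpers, not part of any Arazzo tooling, and the sample body is invented for illustration.

```python
from typing import Any


def resolve_pointer(body: Any, pointer: str) -> Any:
    """Resolve an RFC 6901 JSON Pointer such as '/job/id' against decoded JSON."""
    if pointer == "":
        return body  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError(f"invalid JSON Pointer: {pointer!r}")
    current = body
    for raw in pointer[1:].split("/"):
        # Unescape in the order RFC 6901 requires: '~1' first, then '~0'.
        token = raw.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]
        else:
            current = current[token]
    return current


def resolve_output(expression: str, body: Any) -> Any:
    """Handle only the '$response.body#<pointer>' form used by this workflow."""
    prefix = "$response.body#"
    if not expression.startswith(prefix):
        raise ValueError(f"unsupported expression: {expression!r}")
    return resolve_pointer(body, expression[len(prefix):])


# Illustrative body shaped like the enqueueOcr response the workflow reads from.
sample = {"job": {"id": "abc", "status": "Processing"}}
print(resolve_output("$response.body#/job/id", sample))  # abc
```

A missing key raises `KeyError` in this sketch; a real runner would need to decide whether an unresolved output is an error or an empty value.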