Google Cloud Dataflow · Arazzo Workflow

Google Cloud Dataflow Capture Worker Debug Data

Version 1.0.0

Confirm a job, fetch a worker component's debug config, then send a debug capture.

1 workflow · 1 source API · 1 provider
Tags: Apache Beam, Batch Processing, Big Data, Data Processing, ETL, Stream Processing, Arazzo, Workflows

Provider

google-cloud-dataflow

Workflows

capture-worker-debug-data
Confirm a job, get its debug config, and send a debug capture.
Reads a job to confirm it exists, fetches the debug configuration for a worker component, then submits an encoded debug capture for that component.
3 steps
inputs: accessToken, componentId, data, jobId, location, projectId, workerId
outputs: captureStatus, debugConfig, jobId
1. confirmJob (operationId: getLocationJob)
Read the job to confirm it exists before requesting debug configuration for one of its worker components.
2. getDebugConfig (operationId: getLocationJobDebugConfig)
Retrieve the debug configuration for the targeted worker and component of the job.
3. sendDebugCapture (operationId: sendLocationJobDebugCapture)
Submit the encoded debug capture payload for the same worker component as raw data.
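
The three steps above can be sketched as plain HTTP request descriptions. This is only an illustration under stated assumptions, not the workflow runner: the workflow names operationIds rather than paths, so the base URL and the `/debug/getConfig` and `/debug/sendCapture` paths below are assumed from the public Dataflow v1b3 REST layout, and `build_requests` is a hypothetical helper. Nothing is sent over the network.

```python
# Assumption: base URL and paths follow the public Dataflow v1b3 REST layout;
# the workflow itself only references operationIds in the OpenAPI description.
BASE_URL = "https://dataflow.googleapis.com/v1b3"


def build_requests(inputs):
    """Return the three step requests as dicts, in workflow order (hypothetical helper)."""
    job_url = (
        f"{BASE_URL}/projects/{inputs['projectId']}"
        f"/locations/{inputs['location']}/jobs/{inputs['jobId']}"
    )
    # Every step carries the same Bearer credential in the Authorization header.
    headers = {"Authorization": f"Bearer {inputs['accessToken']}"}
    # Fields shared by both debug request bodies.
    target = {
        "workerId": inputs["workerId"],
        "componentId": inputs["componentId"],
        "location": inputs["location"],
    }
    return [
        {"stepId": "confirmJob", "method": "GET", "url": job_url,
         "headers": headers, "json": None},
        {"stepId": "getDebugConfig", "method": "POST",
         "url": job_url + "/debug/getConfig", "headers": headers, "json": target},
        {"stepId": "sendDebugCapture", "method": "POST",
         "url": job_url + "/debug/sendCapture", "headers": headers,
         "json": {**target, "data": inputs["data"], "dataFormat": "RAW"}},
    ]
```

Each dict could be passed to an HTTP client of your choice; the workflow treats a step as successful only when the response status is 200.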

Source API Descriptions

Arazzo Workflow Specification

google-cloud-dataflow-capture-worker-debug-data-workflow.yml
arazzo: 1.0.1
info:
  title: Google Cloud Dataflow Capture Worker Debug Data
  summary: Confirm a job, fetch a worker component's debug config, then send a debug capture.
  description: >-
    Collects worker-level debug data for a specific component of a Dataflow job.
    The workflow reads the job to confirm it exists, retrieves the debug
    configuration for the targeted worker and component, then submits an encoded
    debug capture payload for that same component. Every step spells out its
    request inline, including the Bearer Authorization header that Google Cloud
    requires, so the flow can be read and executed without opening the underlying
    OpenAPI description.
  version: 1.0.0
sourceDescriptions:
- name: dataflowApi
  url: ../openapi/google-cloud-dataflow-api-openapi.yml
  type: openapi
workflows:
- workflowId: capture-worker-debug-data
  summary: Confirm a job, get its debug config, and send a debug capture.
  description: >-
    Reads a job to confirm it exists, fetches the debug configuration for a
    worker component, then submits an encoded debug capture for that component.
  inputs:
    type: object
    required:
    - accessToken
    - projectId
    - location
    - jobId
    - workerId
    - componentId
    - data
    properties:
      accessToken:
        type: string
        description: Google Cloud OAuth 2.0 access token used as a Bearer credential.
      projectId:
        type: string
        description: The Google Cloud project id that owns the job.
      location:
        type: string
        description: The regional endpoint that contains the job (e.g. us-central1).
      jobId:
        type: string
        description: The id of the job whose worker debug data is being captured.
      workerId:
        type: string
        description: The worker id that generated the debug data.
      componentId:
        type: string
        description: The internal component id for which debug data is captured.
      data:
        type: string
        description: The encoded debug data to submit.
  steps:
  - stepId: confirmJob
    description: >-
      Read the job to confirm it exists before requesting debug configuration
      for one of its worker components.
    operationId: getLocationJob
    parameters:
    - name: Authorization
      in: header
      value: Bearer {$inputs.accessToken}
    - name: projectId
      in: path
      value: $inputs.projectId
    - name: location
      in: path
      value: $inputs.location
    - name: jobId
      in: path
      value: $inputs.jobId
    successCriteria:
    - condition: $statusCode == 200
    outputs:
      jobId: $response.body#/id
      currentState: $response.body#/currentState
  - stepId: getDebugConfig
    description: >-
      Retrieve the debug configuration for the targeted worker and component of
      the job.
    operationId: getLocationJobDebugConfig
    parameters:
    - name: Authorization
      in: header
      value: Bearer {$inputs.accessToken}
    - name: projectId
      in: path
      value: $inputs.projectId
    - name: location
      in: path
      value: $inputs.location
    - name: jobId
      in: path
      value: $inputs.jobId
    requestBody:
      contentType: application/json
      payload:
        workerId: $inputs.workerId
        componentId: $inputs.componentId
        location: $inputs.location
    successCriteria:
    - condition: $statusCode == 200
    outputs:
      config: $response.body#/config
  - stepId: sendDebugCapture
    description: >-
      Submit the encoded debug capture payload for the same worker component as
      raw data.
    operationId: sendLocationJobDebugCapture
    parameters:
    - name: Authorization
      in: header
      value: Bearer {$inputs.accessToken}
    - name: projectId
      in: path
      value: $inputs.projectId
    - name: location
      in: path
      value: $inputs.location
    - name: jobId
      in: path
      value: $inputs.jobId
    requestBody:
      contentType: application/json
      payload:
        workerId: $inputs.workerId
        componentId: $inputs.componentId
        data: $inputs.data
        dataFormat: RAW
        location: $inputs.location
    successCriteria:
    - condition: $statusCode == 200
    outputs:
      statusCode: $statusCode
  outputs:
    jobId: $steps.confirmJob.outputs.jobId
    debugConfig: $steps.getDebugConfig.outputs.config
    captureStatus: $steps.sendDebugCapture.outputs.statusCode
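
The `outputs` expressions in the specification combine JSON Pointer fragments (`$response.body#/id`) with step references (`$steps.<stepId>.outputs.<name>`). The sketch below shows one way a runner might resolve them once the three responses have arrived. It is a simplified illustration that handles only the expression forms used in this workflow; the helper names and the sample response values are hypothetical.

```python
# Output expressions copied from the workflow specification.
STEP_OUTPUTS = {
    "confirmJob": {
        "jobId": "$response.body#/id",
        "currentState": "$response.body#/currentState",
    },
    "getDebugConfig": {"config": "$response.body#/config"},
    "sendDebugCapture": {"statusCode": "$statusCode"},
}
WORKFLOW_OUTPUTS = {
    "jobId": "$steps.confirmJob.outputs.jobId",
    "debugConfig": "$steps.getDebugConfig.outputs.config",
    "captureStatus": "$steps.sendDebugCapture.outputs.statusCode",
}


def resolve_pointer(document, pointer):
    """Resolve an RFC 6901 JSON Pointer (for example '/id') against parsed JSON."""
    if pointer == "":
        return document
    current = document
    for token in pointer[1:].split("/"):
        # Unescape per RFC 6901: '~1' is '/', then '~0' is '~'.
        token = token.replace("~1", "/").replace("~0", "~")
        current = current[int(token)] if isinstance(current, list) else current[token]
    return current


def evaluate(expression, status_code, body):
    """Evaluate the two step-level expression forms this workflow uses."""
    if expression == "$statusCode":
        return status_code
    prefix = "$response.body#"
    if expression.startswith(prefix):
        return resolve_pointer(body, expression[len(prefix):])
    raise ValueError(f"unsupported expression: {expression}")


def workflow_outputs(responses):
    """Map stepId -> (status_code, parsed_body) to the workflow-level outputs."""
    steps = {
        step_id: {name: evaluate(expr, *responses[step_id]) for name, expr in outs.items()}
        for step_id, outs in STEP_OUTPUTS.items()
    }
    result = {}
    for name, expr in WORKFLOW_OUTPUTS.items():
        # "$steps.<stepId>.outputs.<name>" splits into exactly four parts.
        _, step_id, _, output_name = expr.split(".")
        result[name] = steps[step_id][output_name]
    return result
```

With made-up responses such as `{"confirmJob": (200, {"id": "job-1", "currentState": "JOB_STATE_RUNNING"}), "getDebugConfig": (200, {"config": "cfg"}), "sendDebugCapture": (200, {})}`, `workflow_outputs` yields the `jobId`, `debugConfig`, and `captureStatus` values the workflow declares.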