Cross-Provider Workflow

Snowflake Load to DataHub Catalog Register

Version 1.0.0

Refresh a Snowflake Snowpipe to load data, then register the target dataset in DataHub.

1 workflow, 2 source APIs, 2 providers

Providers Orchestrated

snowflake, datahub

Workflows

snowflake-load-to-catalog
Refresh a Snowflake pipe, then register the loaded dataset in DataHub.
Refreshes a Snowflake Snowpipe to load staged files, then upserts the target dataset entity into the DataHub catalog for discovery and governance.
Steps: 2
Inputs: database, datasetUrn, name, schema
Outputs: refreshStatus, registeredUrn

1. refresh-pipe ($sourceDescriptions.snowflakePipeApi.refreshPipe)
   Refresh the Snowflake pipe to ingest staged files.
2. register-dataset ($sourceDescriptions.datahubApi.upsertEntities)
   Register the loaded dataset entity in the DataHub catalog.
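
The two steps run in order, and the second is only reached when the first meets its success criterion (HTTP 200 in the specification on this page). The sketch below illustrates that control flow and is not an Arazzo runner. The function name, the `Operation` type, and the stand-in operations are assumptions made for illustration; real calls to the Snowflake and DataHub APIs would be injected in their place.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical shape for an API operation: it receives the workflow inputs and
# returns (HTTP status code, parsed JSON body). Building the real request
# (path parameters, JSON payload, authentication) is left to the callable.
Operation = Callable[[Dict[str, str]], Tuple[int, Any]]


def run_snowflake_load_to_catalog(
    inputs: Dict[str, str],
    refresh_pipe: Operation,
    upsert_entities: Operation,
) -> Dict[str, Any]:
    """Run refresh-pipe, then register-dataset, stopping at the first failure."""
    # Step 1: refresh-pipe. Only HTTP 200 counts as success in this workflow.
    refresh_status, _ = refresh_pipe(inputs)
    if refresh_status != 200:
        raise RuntimeError(f"refresh-pipe failed with status {refresh_status}")

    # Step 2: register-dataset. Reached only when step 1 succeeded.
    register_status, body = upsert_entities(inputs)
    if register_status != 200:
        raise RuntimeError(f"register-dataset failed with status {register_status}")

    # Workflow outputs: the step 1 status code and the URN of the first
    # entity in the response array from step 2.
    return {"refreshStatus": refresh_status, "registeredUrn": body[0]["urn"]}


# Stand-in operations so the sketch runs without network access.
def fake_refresh_pipe(inputs: Dict[str, str]) -> Tuple[int, Any]:
    return 200, {}


def fake_upsert_entities(inputs: Dict[str, str]) -> Tuple[int, Any]:
    # Assumed response shape: an array echoing the upserted entity.
    return 200, [{"urn": inputs["datasetUrn"]}]


# Hypothetical input values, chosen only for illustration.
example_inputs = {
    "database": "analytics",
    "schema": "public",
    "name": "orders",
    "datasetUrn": "urn:li:dataset:(urn:li:dataPlatform:snowflake,analytics.public.orders,PROD)",
}

result = run_snowflake_load_to_catalog(
    example_inputs, fake_refresh_pipe, fake_upsert_entities
)
print(result)
```

Passing the operations in as callables keeps the ordering and failure handling separate from any HTTP client, so the same flow can be exercised offline with stubs.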

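Source API Descriptions

snowflakePipeApi (openapi): https://raw.githubusercontent.com/api-evangelist/snowflake/refs/heads/main/openapi/pipe.yaml
datahubApi (openapi): https://raw.githubusercontent.com/api-evangelist/datahub/refs/heads/main/openapi/datahub-openapi-openapi.yml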

Arazzo Workflow Specification

data-snowflake-load-to-datahub-catalog.yml
arazzo: 1.0.1
info:
  title: Snowflake Load to DataHub Catalog Register
  summary: Refresh a Snowflake Snowpipe to load data, then register the target dataset in DataHub.
  description: >-
    A data pipeline workflow that refreshes a Snowflake pipe to ingest staged files into
    a table, then registers the loaded dataset as an entity in the DataHub catalog so the
    newly populated table is discoverable and governed. Demonstrates connecting a
    warehouse load process with a metadata catalog.
  version: 1.0.0
sourceDescriptions:
  - name: snowflakePipeApi
    url: https://raw.githubusercontent.com/api-evangelist/snowflake/refs/heads/main/openapi/pipe.yaml
    type: openapi
  - name: datahubApi
    url: https://raw.githubusercontent.com/api-evangelist/datahub/refs/heads/main/openapi/datahub-openapi-openapi.yml
    type: openapi
workflows:
  - workflowId: snowflake-load-to-catalog
    summary: Refresh a Snowflake pipe, then register the loaded dataset in DataHub.
    description: >-
      Refreshes a Snowflake Snowpipe to load staged files, then upserts the target
      dataset entity into the DataHub catalog for discovery and governance.
    inputs:
      type: object
      properties:
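        database:
          type: string
          description: Snowflake database that contains the pipe.
        schema:
          type: string
          description: Snowflake schema that contains the pipe.
        name:
          type: string
          description: Name of the Snowpipe to refresh; also used as the dataset name in DataHub.
        datasetUrn:
          type: string
          description: DataHub URN of the dataset to register.
      required: [database, schema, name, datasetUrn]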
    steps:
      - stepId: refresh-pipe
        description: Refresh the Snowflake pipe to ingest staged files.
        operationId: $sourceDescriptions.snowflakePipeApi.refreshPipe
        parameters:
          - name: database
            in: path
            value: $inputs.database
          - name: schema
            in: path
            value: $inputs.schema
          - name: name
            in: path
            value: $inputs.name
        successCriteria:
          - condition: $statusCode == 200
        outputs:
          refreshStatus: $statusCode
      - stepId: register-dataset
        description: Register the loaded dataset entity in the DataHub catalog.
        operationId: $sourceDescriptions.datahubApi.upsertEntities
        requestBody:
          contentType: application/json
          payload:
            - urn: $inputs.datasetUrn
              aspects:
                datasetProperties:
                  name: $inputs.name
                  # Fully qualified name built as database.schema.name
                  qualifiedName: "{$inputs.database}.{$inputs.schema}.{$inputs.name}"
        successCriteria:
          - condition: $statusCode == 200
        outputs:
          registeredUrn: $response.body#/0/urn
    outputs:
      refreshStatus: $steps.refresh-pipe.outputs.refreshStatus
      registeredUrn: $steps.register-dataset.outputs.registeredUrn
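
The `registeredUrn` output uses the expression `$response.body#/0/urn`, where the part after `#` is a JSON Pointer (RFC 6901) into the response body: element `0` of the array, then its `urn` key. The sketch below shows one way such an expression could be evaluated. Both helper functions and the sample response body are assumptions for illustration; the actual DataHub response shape is not shown on this page.

```python
from typing import Any


def resolve_json_pointer(document: Any, pointer: str) -> Any:
    """Resolve an RFC 6901 JSON Pointer such as "/0/urn" against parsed JSON."""
    if pointer == "":
        return document  # The empty pointer refers to the whole document.
    if not pointer.startswith("/"):
        raise ValueError(f"invalid JSON Pointer: {pointer!r}")
    current = document
    for raw_token in pointer[1:].split("/"):
        # Unescape in the order RFC 6901 requires: "~1" first, then "~0".
        token = raw_token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]  # Array elements are addressed by index.
        else:
            current = current[token]  # Object members are addressed by key.
    return current


def resolve_body_expression(expression: str, body: Any) -> Any:
    """Evaluate expressions of the form $response.body#<json-pointer> only."""
    prefix = "$response.body#"
    if not expression.startswith(prefix):
        raise ValueError(f"unsupported expression: {expression!r}")
    return resolve_json_pointer(body, expression[len(prefix):])


# Hypothetical response body: an array echoing the upserted entity.
sample_body = [
    {"urn": "urn:li:dataset:(urn:li:dataPlatform:snowflake,analytics.public.orders,PROD)"}
]

registered_urn = resolve_body_expression("$response.body#/0/urn", sample_body)
print(registered_urn)
```

If the response body is not an array, or its first element has no `urn` key, the lookup raises an exception in this sketch, which is a reason to confirm the response schema in the DataHub OpenAPI description before relying on this output.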