Cross-Provider Workflow

dbt Run to Snowflake Query to DataHub Register

Version 1.0.0

Run a dbt job, validate the output in Snowflake, then register the dataset in DataHub.

1 workflow, 3 source APIs, 3 providers

Providers Orchestrated

dbt, Snowflake, DataHub

Workflows

dbt-snowflake-datahub
Run dbt, validate in Snowflake, then register in DataHub.
Triggers a dbt Cloud job, validates the output table with a Snowflake query, and upserts the dataset into the DataHub catalog.
3 steps
inputs: accountId, datasetUrn, jobId, validationStatement, warehouse
outputs: registeredUrn, rowCount, runId
1. run-dbt
   $sourceDescriptions.dbtAdminApi.triggerJobRun
   Trigger the dbt Cloud transformation job run.
2. validate-snowflake
   $sourceDescriptions.snowflakeSqlApi.SubmitStatement
   Validate the produced table by querying Snowflake.
3. register-datahub
   $sourceDescriptions.datahubApi.upsertEntities
   Register the validated dataset in the DataHub catalog.
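
Each step passes values forward through runtime expressions such as `$response.body#/data/id`, where the part after `#` is a JSON Pointer into the response body. The sketch below is one rough way a runner could resolve those expressions. The helper names (`resolve_pointer`, `resolve_body_expression`) and all three sample response bodies are hypothetical, invented for illustration; they are not taken from the dbt, Snowflake, or DataHub APIs or from any Arazzo tooling.

```python
from typing import Any

BODY_PREFIX = "$response.body#"


def resolve_pointer(document: Any, pointer: str) -> Any:
    """Resolve an RFC 6901 JSON Pointer against already-parsed JSON data."""
    if pointer == "":
        return document
    if not pointer.startswith("/"):
        raise ValueError(f"invalid JSON Pointer: {pointer!r}")
    current = document
    for raw_token in pointer[1:].split("/"):
        # Unescape per RFC 6901: "~1" becomes "/", then "~0" becomes "~".
        token = raw_token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]  # array elements are addressed by index
        else:
            current = current[token]       # object members are addressed by key
    return current


def resolve_body_expression(expression: str, body: Any) -> Any:
    """Resolve a '$response.body#/...' expression (hypothetical helper)."""
    if not expression.startswith(BODY_PREFIX):
        raise ValueError(f"unsupported expression: {expression!r}")
    return resolve_pointer(body, expression[len(BODY_PREFIX):])


# Made-up response bodies, shaped only to match the pointers used in the spec.
dbt_response = {"data": {"id": 4711}}
snowflake_response = {"resultSetMetaData": {"numRows": 3}}
datahub_response = [
    {"urn": "urn:li:dataset:(urn:li:dataPlatform:snowflake,analytics.orders,PROD)"}
]

run_id = resolve_body_expression("$response.body#/data/id", dbt_response)
row_count = resolve_body_expression(
    "$response.body#/resultSetMetaData/numRows", snowflake_response
)
registered_urn = resolve_body_expression("$response.body#/0/urn", datahub_response)
```

Note that `#/0/urn` indexes into an array, so the third expression only resolves if the DataHub response body is a list whose first element carries a `urn` field.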

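Source API Descriptions

dbtAdminApi (openapi): https://raw.githubusercontent.com/api-evangelist/dbt/refs/heads/main/openapi/dbt-cloud-administrative-api-openapi.yml
snowflakeSqlApi (openapi): https://raw.githubusercontent.com/api-evangelist/snowflake/refs/heads/main/openapi/sqlapi.yaml
datahubApi (openapi): https://raw.githubusercontent.com/api-evangelist/datahub/refs/heads/main/openapi/datahub-openapi-openapi.yml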

Arazzo Workflow Specification

data-dbt-run-to-snowflake-to-datahub.yml
arazzo: 1.0.1
info:
  title: dbt Run to Snowflake Query to DataHub Register
  summary: Run a dbt job, validate the output in Snowflake, then register the dataset in DataHub.
  description: >-
    An end-to-end data pipeline workflow that triggers a dbt Cloud job to build models,
    queries Snowflake to validate the produced table, and registers the resulting dataset
    as an entity in the DataHub catalog. Demonstrates a transform, validate, and govern
    sequence across a transformation tool, a warehouse, and a metadata catalog.
  version: 1.0.0
sourceDescriptions:
  - name: dbtAdminApi
    url: https://raw.githubusercontent.com/api-evangelist/dbt/refs/heads/main/openapi/dbt-cloud-administrative-api-openapi.yml
    type: openapi
  - name: snowflakeSqlApi
    url: https://raw.githubusercontent.com/api-evangelist/snowflake/refs/heads/main/openapi/sqlapi.yaml
    type: openapi
  - name: datahubApi
    url: https://raw.githubusercontent.com/api-evangelist/datahub/refs/heads/main/openapi/datahub-openapi-openapi.yml
    type: openapi
workflows:
  - workflowId: dbt-snowflake-datahub
    summary: Run dbt, validate in Snowflake, then register in DataHub.
    description: >-
      Triggers a dbt Cloud job, validates the output table with a Snowflake query, and
      upserts the dataset into the DataHub catalog.
    inputs:
      type: object
      properties:
        accountId:
          type: integer
        jobId:
          type: integer
        validationStatement:
          type: string
        warehouse:
          type: string
        datasetUrn:
          type: string
    steps:
      - stepId: run-dbt
        description: Trigger the dbt Cloud transformation job run.
        operationId: $sourceDescriptions.dbtAdminApi.triggerJobRun
        parameters:
          - name: accountId
            in: path
            value: $inputs.accountId
          - name: jobId
            in: path
            value: $inputs.jobId
        requestBody:
          contentType: application/json
          payload:
            cause: Build models for validation and cataloging
        successCriteria:
          - condition: $statusCode == 200
        outputs:
          runId: $response.body#/data/id
      - stepId: validate-snowflake
        description: Validate the produced table by querying Snowflake.
        operationId: $sourceDescriptions.snowflakeSqlApi.SubmitStatement
        requestBody:
          contentType: application/json
          payload:
            statement: $inputs.validationStatement
            warehouse: $inputs.warehouse
            timeout: 120
        successCriteria:
          - condition: $statusCode == 200
        outputs:
          rowCount: $response.body#/resultSetMetaData/numRows
      - stepId: register-datahub
        description: Register the validated dataset in the DataHub catalog.
        operationId: $sourceDescriptions.datahubApi.upsertEntities
        requestBody:
          contentType: application/json
          payload:
            - urn: $inputs.datasetUrn
              aspects:
                datasetProperties:
                  customProperties:
                    dbt_run_id: $steps.run-dbt.outputs.runId
        successCriteria:
          - condition: $statusCode == 200
        outputs:
          registeredUrn: $response.body#/0/urn
    outputs:
      runId: $steps.run-dbt.outputs.runId
      rowCount: $steps.validate-snowflake.outputs.rowCount
      registeredUrn: $steps.register-datahub.outputs.registeredUrn
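
The `inputs` schema above declares five typed properties but no `required` list, so the schema alone would not reject a call that omits one of them. Since every property is referenced by a `$inputs.…` expression in some step, a caller may want a pre-flight check before starting the workflow. The following is a minimal sketch under that assumption. The `check_inputs` helper and the sample values (account and job IDs, SQL statement, warehouse name, URN) are hypothetical.

```python
from typing import Any, Dict, List

# Python types mirroring the "type" keywords in the workflow's inputs schema.
EXPECTED_TYPES = {
    "accountId": int,
    "jobId": int,
    "validationStatement": str,
    "warehouse": str,
    "datasetUrn": str,
}


def check_inputs(inputs: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the inputs look usable."""
    problems = []
    for name, expected in EXPECTED_TYPES.items():
        if name not in inputs:
            problems.append(f"missing input: {name}")
            continue
        value = inputs[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems


# Illustrative values only; none of these come from the original spec.
sample_inputs = {
    "accountId": 12345,
    "jobId": 67890,
    "validationStatement": "SELECT COUNT(*) FROM analytics.orders",
    "warehouse": "TRANSFORM_WH",
    "datasetUrn": "urn:li:dataset:(urn:li:dataPlatform:snowflake,analytics.orders,PROD)",
}

problems = check_inputs(sample_inputs)
```

Treating all five inputs as mandatory is a choice made for this sketch. If some should be optional, the spec would need to say so, for example with an explicit `required` list in the schema.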