Amazon Neptune · Arazzo Workflow

Amazon Neptune Bulk Loader Start and Poll

Version 1.0.0

Start a bulk loader job from S3 and poll its status until the load completes.

1 workflow · 1 source API · 1 provider
Tags: Database, Graph Database, Gremlin, Neptune, Property Graph, RDF, SPARQL, Arazzo, Workflows

Provider

amazon-neptune

Workflows

bulk-loader-start-and-poll
Kick off a bulk load from S3 and poll until the load is complete.
Starts a bulk loader job, then polls its status on a loop until the overall status is LOAD_COMPLETED.
2 steps
Inputs: failOnError, format, iamRoleArn, parallelism, region, source
Outputs: loadId, overallStatus, totalRecords

1. startLoad (startLoaderJob): Start a bulk loader job from the supplied S3 source and capture the returned load id.
2. pollLoad (getLoaderJobStatus): Poll the loader job status with detailed feed counts. Repeat while the overall status is LOAD_IN_PROGRESS, and finish once it is LOAD_COMPLETED.
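
The two steps above amount to a start-then-poll loop, which can be sketched in Python. This is a minimal sketch under stated assumptions, not the behavior of any particular Arazzo runner: `run_bulk_load`, `start_load`, and `get_status` are hypothetical names standing in for the startLoaderJob and getLoaderJobStatus operations, the 30-second delay and 60-retry limit mirror the values in the specification, and treating any status other than LOAD_IN_PROGRESS or LOAD_COMPLETED as an error is an assumption the workflow itself does not state.

```python
import time
from typing import Callable, Dict


def run_bulk_load(
    start_load: Callable[[], str],
    get_status: Callable[[str], Dict],
    retry_after: float = 30.0,
    retry_limit: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Start a load, then poll until LOAD_COMPLETED (hypothetical helper)."""
    load_id = start_load()  # step 1 (startLoad): capture the load id
    retries = 0
    while True:
        overall = get_status(load_id)  # step 2 (pollLoad)
        status = overall["status"]
        if status == "LOAD_COMPLETED":
            return {
                "loadId": load_id,
                "overallStatus": status,
                "totalRecords": overall["totalRecords"],
            }
        if status != "LOAD_IN_PROGRESS":
            # Assumption: every other status is treated as terminal here.
            raise RuntimeError(f"load {load_id} ended with status {status}")
        if retries >= retry_limit:
            raise TimeoutError(
                f"load {load_id} still running after {retry_limit} retries"
            )
        retries += 1
        sleep(retry_after)  # wait before the next poll


# Usage with in-memory fakes (no network calls; all values are made up).
_responses = iter([
    {"status": "LOAD_IN_PROGRESS", "totalRecords": 10},
    {"status": "LOAD_COMPLETED", "totalRecords": 25},
])
result = run_bulk_load(
    start_load=lambda: "example-load-id",
    get_status=lambda load_id: next(_responses),
    sleep=lambda seconds: None,  # skip the real delay in this demo
)
print(result)
```

Passing the two operations in as callables keeps the loop independent of any HTTP client or request signing, so the retry logic can be exercised without a Neptune cluster.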

Source API Descriptions

neptuneDataApi (openapi): ../openapi/amazon-neptune-data-openapi.yml

Arazzo Workflow Specification

amazon-neptune-bulk-loader-poll-workflow.yml
arazzo: 1.0.1
info:
  title: Amazon Neptune Bulk Loader Start and Poll
  summary: Start a bulk loader job from S3 and poll its status until the load completes.
  description: >-
    The canonical Neptune bulk-load pattern over the Data API. The workflow
    starts a bulk loader job from an Amazon S3 source, captures the returned load
    id, then repeatedly polls the loader job status with detailed feed counts
    until the overall status reaches LOAD_COMPLETED. A poll loop with a retry
    delay handles the LOAD_IN_PROGRESS state, and a branch ends the flow once the
    load finishes. Every step spells out its request inline so the flow can be
    read and executed without opening the underlying OpenAPI description.
  version: 1.0.0
sourceDescriptions:
- name: neptuneDataApi
  url: ../openapi/amazon-neptune-data-openapi.yml
  type: openapi
workflows:
- workflowId: bulk-loader-start-and-poll
  summary: Kick off a bulk load from S3 and poll until the load is complete.
  description: >-
    Starts a bulk loader job, then polls its status on a loop until the overall
    status is LOAD_COMPLETED.
  inputs:
    type: object
    required:
    - source
    - format
    - iamRoleArn
    - region
    properties:
      source:
        type: string
        description: The S3 URI of the data files or folders to load.
      format:
        type: string
        description: The data format (csv, opencypher, ntriples, nquads, rdfxml, turtle).
      iamRoleArn:
        type: string
        description: The ARN of the IAM role with S3 access.
      region:
        type: string
        description: The AWS Region of the S3 bucket.
      parallelism:
        type: string
        description: The degree of parallelism (LOW, MEDIUM, HIGH, OVERSUBSCRIBE).
      failOnError:
        type: string
        description: Whether to stop the load on error (TRUE or FALSE).
  steps:
  - stepId: startLoad
    description: >-
      Start a bulk loader job from the supplied S3 source and capture the
      returned load id.
    operationId: startLoaderJob
    requestBody:
      contentType: application/json
      payload:
        source: $inputs.source
        format: $inputs.format
        iamRoleArn: $inputs.iamRoleArn
        region: $inputs.region
        parallelism: $inputs.parallelism
        failOnError: $inputs.failOnError
    successCriteria:
    - condition: $statusCode == 200
    outputs:
      loadId: $response.body#/payload/loadId
  - stepId: pollLoad
    description: >-
      Poll the loader job status with detailed feed counts. Repeat while the
      overall status is LOAD_IN_PROGRESS, and finish once it is LOAD_COMPLETED.
    operationId: getLoaderJobStatus
    parameters:
    - name: loadId
      in: path
      value: $steps.startLoad.outputs.loadId
    - name: details
      in: query
      value: true
    successCriteria:
    - condition: $statusCode == 200
    - condition: $response.body#/payload/overallStatus/status == 'LOAD_COMPLETED'
    outputs:
      overallStatus: $response.body#/payload/overallStatus/status
      totalRecords: $response.body#/payload/overallStatus/totalRecords
    onSuccess:
    - name: loadComplete
      type: end
    onFailure:
    # `retry` is a failure action in Arazzo, so the step counts as failed
    # while the load is still in progress and is retried after a delay.
    - name: loadStillRunning
      type: retry
      retryAfter: 30
      retryLimit: 60
      criteria:
      - condition: $statusCode == 200
      - condition: $response.body#/payload/overallStatus/status == 'LOAD_IN_PROGRESS'
  outputs:
    loadId: $steps.startLoad.outputs.loadId
    overallStatus: $steps.pollLoad.outputs.overallStatus
    totalRecords: $steps.pollLoad.outputs.totalRecords
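
Output expressions such as `$response.body#/payload/overallStatus/status` pair a runtime expression with a JSON Pointer (RFC 6901) into the response body. The lookup can be sketched in Python as follows. This is a minimal sketch: `resolve_pointer` is a hypothetical helper, not part of any Arazzo tooling, and the sample body only mirrors the paths used in the workflow, with every value invented for illustration.

```python
from typing import Any


def resolve_pointer(document: Any, pointer: str) -> Any:
    """Resolve an RFC 6901 JSON Pointer such as '/payload/loadId'."""
    if pointer == "":
        return document  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError(f"invalid JSON Pointer: {pointer!r}")
    current = document
    for token in pointer[1:].split("/"):
        # Unescape per RFC 6901: '~1' becomes '/', then '~0' becomes '~'.
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]  # array index
        else:
            current = current[token]  # object member
    return current


# Sample status response body; the shape follows the paths in the workflow,
# but the values are made up.
sample_body = {
    "payload": {
        "overallStatus": {"status": "LOAD_COMPLETED", "totalRecords": 25},
    },
}

outputs = {
    "overallStatus": resolve_pointer(sample_body, "/payload/overallStatus/status"),
    "totalRecords": resolve_pointer(sample_body, "/payload/overallStatus/totalRecords"),
}
print(outputs)
```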