Amazon Polly · Arazzo Workflow

Amazon Polly Select Voice and Start Synthesis Task

Version 1.0.0

Discover a voice for a language, then start an async synthesis task with it.

1 workflow · 1 source API · 1 provider

Tags: AI, Machine Learning, Speech Synthesis, Text-To-Speech, TTS, Voice, SSML, Neural Engine, Generative AI, Arazzo, Workflows

Provider

amazon-polly

Workflows

list-voices-start-synthesis-task

Pick an available voice and start an asynchronous synthesis task using it.

Lists voices for the requested engine and language, selects the first match, and starts an asynchronous synthesis task to an S3 bucket using that voice.

2 steps

inputs: amzDate, authorization, contentSha256, engine, languageCode, outputFormat, outputS3BucketName, securityToken, text

outputs: selectedVoiceId, taskId, taskStatus
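
Two of the signing inputs listed above, amzDate and contentSha256, are plain derived values. The following is a minimal sketch of how a caller might compute them, assuming the usual SigV4 conventions (a UTC timestamp in YYYYMMDDTHHMMSSZ form and a lowercase hex SHA-256 digest of the raw request body). The helper names amz_date and content_sha256 are hypothetical and not part of the workflow. The sketch does not compute the Authorization value itself. Because a SigV4 signature covers the method, path, query string, and payload hash, each of the two steps would normally need its own signature.

```python
import hashlib
from datetime import datetime, timezone


def amz_date(now: datetime) -> str:
    """Format a timezone-aware datetime as an X-Amz-Date value (UTC)."""
    return now.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


def content_sha256(payload: bytes) -> str:
    """Return the lowercase hex SHA-256 digest of the raw request body."""
    return hashlib.sha256(payload).hexdigest()


if __name__ == "__main__":
    # Illustrative values only; a real caller would hash the exact bytes it sends.
    print(amz_date(datetime.now(timezone.utc)))
    print(content_sha256(b""))
```

DescribeVoices is sent as a GET with no body, so its contentSha256 would be the digest of the empty byte string, while StartSpeechSynthesisTask would hash its serialized JSON body.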

describeVoices

DescribeVoices

List the voices available for the requested engine and language.

startTask

StartSpeechSynthesisTask

Start an asynchronous synthesis task to the supplied S3 bucket using the voice selected from the DescribeVoices response.
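
The hand-off between the two steps above can be sketched as two small functions: one that takes the first voice from a DescribeVoices-shaped response, and one that assembles the StartSpeechSynthesisTask body from the workflow inputs. This is an illustration under stated assumptions, not the workflow runner's implementation. The function names are hypothetical, and the sample response in the usage note is made-up data shaped like the Voices array the workflow reads.

```python
from typing import Any, Dict, Optional


def select_first_voice_id(describe_voices_body: Dict[str, Any]) -> Optional[str]:
    """Mirror $response.body#/Voices/0/Id: first voice's Id, or None if no match."""
    voices = describe_voices_body.get("Voices") or []
    return voices[0]["Id"] if voices else None


def build_start_task_payload(inputs: Dict[str, Any], voice_id: str) -> Dict[str, Any]:
    """Assemble the request body for the startTask step from workflow inputs."""
    payload = {
        "OutputFormat": inputs["outputFormat"],
        "OutputS3BucketName": inputs["outputS3BucketName"],
        "Text": inputs["text"],
        "VoiceId": voice_id,
    }
    # engine and languageCode are optional workflow inputs; skip them when absent.
    for input_name, field in (("engine", "Engine"), ("languageCode", "LanguageCode")):
        if inputs.get(input_name) is not None:
            payload[field] = inputs[input_name]
    return payload


if __name__ == "__main__":
    sample_response = {"Voices": [{"Id": "Joanna"}, {"Id": "Matthew"}]}  # sample data
    sample_inputs = {
        "engine": "neural",
        "languageCode": "en-US",
        "outputFormat": "mp3",
        "outputS3BucketName": "example-bucket",  # hypothetical bucket name
        "text": "Hello from Polly.",
    }
    voice = select_first_voice_id(sample_response)
    if voice is not None:
        print(build_start_task_payload(sample_inputs, voice))
```

Taking the first element is a deliberately simple selection rule. If the filter matches no voice, the Voices array is empty and there is nothing to pass to the second step, which is why the sketch returns None rather than raising.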

Source API Descriptions

openapi

pollyApi https://raw.githubusercontent.com/api-evangelist/amazon-polly/refs/heads/main/openapi/amazon-polly-openapi-original.yaml

Arazzo Workflow Specification

arazzo: 1.0.1
info:
  title: Amazon Polly Select Voice and Start Synthesis Task
  summary: Discover a voice for a language, then start an async synthesis task with it.
  description: >-
    Combines voice discovery with long-form asynchronous synthesis. The workflow
    calls DescribeVoices to find an available voice for the requested engine and
    language, captures the first voice id, and then starts a SpeechSynthesisTask
    that writes its output to an S3 bucket using that voice. Each step spells out
    its request inline, including the AWS Signature Version 4 signing headers, so
    the flow can be read and executed without opening the underlying OpenAPI
    description.
  version: 1.0.0
sourceDescriptions:
- name: pollyApi
  url: ../openapi/amazon-polly-openapi-original.yaml
  type: openapi
workflows:
- workflowId: list-voices-start-synthesis-task
  summary: Pick an available voice and start an asynchronous synthesis task using it.
  description: >-
    Lists voices for the requested engine and language, selects the first match,
    and starts an asynchronous synthesis task to an S3 bucket using that voice.
  inputs:
    type: object
    required:
    - amzDate
    - authorization
    - text
    - outputFormat
    - outputS3BucketName
    properties:
      amzDate:
        type: string
        description: The X-Amz-Date timestamp used to sign the requests.
      authorization:
        type: string
        description: The full SigV4 Authorization header value for the request.
      contentSha256:
        type: string
        description: The X-Amz-Content-Sha256 hex digest of the request payload.
      securityToken:
        type: string
        description: Optional X-Amz-Security-Token for temporary credentials.
      engine:
        type: string
        description: Engine to filter voices by and to use for synthesis (standard, neural, long-form, or generative).
      languageCode:
        type: string
        description: ISO language code to filter voices by (e.g. en-US).
      text:
        type: string
        description: The input text (plain text or SSML) to synthesize.
      outputFormat:
        type: string
        description: The output format (mp3, ogg_vorbis, or pcm for audio, or json for speech marks).
      outputS3BucketName:
        type: string
        description: The S3 bucket name to which the output file will be saved.
  steps:
  - stepId: describeVoices
    description: List the voices available for the requested engine and language.
    operationId: DescribeVoices
    parameters:
    - name: Engine
      in: query
      value: $inputs.engine
    - name: LanguageCode
      in: query
      value: $inputs.languageCode
    - name: X-Amz-Date
      in: header
      value: $inputs.amzDate
    - name: Authorization
      in: header
      value: $inputs.authorization
    - name: X-Amz-Content-Sha256
      in: header
      value: $inputs.contentSha256
    - name: X-Amz-Security-Token
      in: header
      value: $inputs.securityToken
    successCriteria:
    - condition: $statusCode == 200
    outputs:
      selectedVoiceId: $response.body#/Voices/0/Id
      voices: $response.body#/Voices
  - stepId: startTask
    description: >-
      Start an asynchronous synthesis task to the supplied S3 bucket using the
      voice selected from the DescribeVoices response.
    operationId: StartSpeechSynthesisTask
    parameters:
    - name: X-Amz-Date
      in: header
      value: $inputs.amzDate
    - name: Authorization
      in: header
      value: $inputs.authorization
    - name: X-Amz-Content-Sha256
      in: header
      value: $inputs.contentSha256
    - name: X-Amz-Security-Token
      in: header
      value: $inputs.securityToken
    requestBody:
      contentType: application/json
      payload:
        Engine: $inputs.engine
        LanguageCode: $inputs.languageCode
        OutputFormat: $inputs.outputFormat
        OutputS3BucketName: $inputs.outputS3BucketName
        Text: $inputs.text
        VoiceId: $steps.describeVoices.outputs.selectedVoiceId
    successCriteria:
    - condition: $statusCode == 200
    outputs:
      taskId: $response.body#/SynthesisTask/TaskId
      taskStatus: $response.body#/SynthesisTask/TaskStatus
      outputUri: $response.body#/SynthesisTask/OutputUri
  outputs:
    selectedVoiceId: $steps.describeVoices.outputs.selectedVoiceId
    taskId: $steps.startTask.outputs.taskId
    taskStatus: $steps.startTask.outputs.taskStatus
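
The step outputs in the specification above use runtime expressions such as $response.body#/Voices/0/Id, where the part after # is a JSON Pointer into the response body. The following is a minimal sketch of how such an expression might be resolved against a parsed body. The function names are hypothetical, only the $response.body form is handled, and the sample bodies in the usage note are made-up data.

```python
from typing import Any


def resolve_pointer(document: Any, pointer: str) -> Any:
    """Resolve an RFC 6901 JSON Pointer against parsed JSON data."""
    if pointer == "":
        return document
    if not pointer.startswith("/"):
        raise ValueError(f"invalid JSON Pointer: {pointer!r}")
    current = document
    for raw in pointer.split("/")[1:]:
        # Unescape in the order RFC 6901 requires: ~1 first, then ~0.
        token = raw.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            if not token.isdigit():
                raise KeyError(f"expected array index, got {token!r}")
            current = current[int(token)]
        else:
            current = current[token]
    return current


def resolve_body_expression(expression: str, body: Any) -> Any:
    """Resolve '$response.body' or '$response.body#<pointer>' against a body."""
    prefix = "$response.body"
    if not expression.startswith(prefix):
        raise ValueError(f"unsupported expression: {expression!r}")
    rest = expression[len(prefix):]
    if rest == "":
        return body
    if not rest.startswith("#"):
        raise ValueError(f"unsupported expression: {expression!r}")
    return resolve_pointer(body, rest[1:])


if __name__ == "__main__":
    voices_body = {"Voices": [{"Id": "Joanna"}]}  # sample data
    task_body = {"SynthesisTask": {"TaskId": "abc-123", "TaskStatus": "scheduled"}}
    print(resolve_body_expression("$response.body#/Voices/0/Id", voices_body))
    print(resolve_body_expression("$response.body#/SynthesisTask/TaskId", task_body))
```

With an empty Voices array, the lookup for /Voices/0/Id raises IndexError in this sketch. How an actual Arazzo runner treats an unresolvable pointer is not specified here.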