ChatGPT · Arazzo Workflow
ChatGPT Describe an Image Input
Version 1.0.0
Send an image URL to the Responses API and retrieve a text description.
View Spec
View on GitHub
AgentsAIChatGPTEmbeddingsFine-TuningGPT-4GPT-5Language ModelOpenAIRealtimeArazzoWorkflows
Provider
Workflows
image-input-describe
Describe an image supplied by URL using a multimodal Responses API call.
Creates a response from a multimodal input combining an instruction and an input_image content part, polls to completion, and returns the description text.
1
describeImage
createResponse
Create a stored response with a multimodal input that pairs the text instruction with an input_image content part.
2
pollDescription
getResponse
Poll the response until image understanding finishes and it leaves the in_progress status.
3
retrieveDescription
getResponse
Retrieve the settled response and extract the description text and token usage, including the image URL in the returned items.