Adobe · Arazzo Workflow
Adobe Extract Content From a PDF
Version 1.0.0
Upload a PDF, submit an extraction job for its text and tables, poll the job until it finishes, and fetch the structured JSON result.
Analytics, Creative Cloud, Digital Asset Management, Document Services, E-Commerce, E-Signatures, Experience Cloud, Generative AI, Marketing, PDF, Work Management, Arazzo, Workflows
Workflows
extract-pdf
Extract structured text and tables from an uploaded PDF.
Requests an upload slot for the source PDF, submits an extractPDF job for the requested elements and table format, polls job status until extraction finishes, and retrieves the download URI for the structured output ZIP.
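As a rough illustration of the "requested elements and table format" inputs, the sketch below builds a request body for the job submission. The field names (assetID, elementsToExtract, tableOutputFormat) and the allowed values are assumptions made for this example and are not taken from this page; check them against the actual spec before relying on them. The function build_extract_request is hypothetical.

```python
# Assumed values: the page only says "requested elements" and "table format".
ALLOWED_ELEMENTS = {"text", "tables"}
ALLOWED_TABLE_FORMATS = {"csv", "xlsx"}


def build_extract_request(asset_id, elements=("text", "tables"), table_format="csv"):
    """Build a JSON-serializable body for the job submission (field names are assumptions)."""
    unknown = set(elements) - ALLOWED_ELEMENTS
    if unknown:
        raise ValueError(f"unsupported elements: {sorted(unknown)}")
    if table_format not in ALLOWED_TABLE_FORMATS:
        raise ValueError(f"unsupported table format: {table_format!r}")
    return {
        "assetID": asset_id,                 # ID returned when the upload slot is requested
        "elementsToExtract": list(elements),
        "tableOutputFormat": table_format,
    }
```

Validating the inputs before submission keeps a malformed request from turning into a failed asynchronous job that is only discovered while polling.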
1
requestUpload
uploadAsset
Request a pre-signed upload URI and an asset ID for the source PDF. The PDF itself is then uploaded out of band with a PUT to the returned uploadUri.
2
submitExtract
extractPDF
Submit an asynchronous extractPDF job that extracts structured content from the uploaded PDF. Returns 201 with an in-progress job status.
3
pollStatus
getJobStatus
Poll the extractPDF job until it is no longer in progress, looping back while the status remains "in progress".
4
getOutput
getAsset
Resolve a pre-signed download URI for the extracted output ZIP.
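The polling loop in step 3 can be sketched as follows. This is a minimal sketch, not the workflow's actual implementation: the HTTP call is abstracted behind a caller-supplied get_status function, and the interval, attempt limit, and response shape (a mapping with a "status" key) are assumptions. Only the "in progress" status string comes from the step description above.

```python
import time
from typing import Callable, Mapping


def poll_until_finished(
    get_status: Callable[[], Mapping[str, str]],
    interval_s: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Mapping[str, str]:
    """Call get_status until the job leaves "in progress", then return the last response."""
    for attempt in range(max_attempts):
        response = get_status()
        if response.get("status") != "in progress":
            return response
        # Wait before the next poll, but not after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"job still in progress after {max_attempts} polls")
```

Passing sleep as a parameter lets the loop be exercised without real delays. The caller still has to inspect the returned status, since leaving "in progress" may mean either success or failure, and only a successful job has an output to resolve in step 4.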