Runloop · Arazzo Workflow
Runloop Create Scenario and Run It
Version 1.0.0
Define a repeatable AI coding evaluation scenario, start a run of it, poll until it is scored, then complete the run.
Tags: AI, AI Agents, Coding Agents, Sandboxes, Devboxes, Code Execution, Evaluation, Benchmarks, SWE-Bench, MCP, Snapshots, microVM, Enterprise, SOC 2, Arazzo, Workflows
Workflows
create-scenario-and-run
Create a scenario, run it, wait for scoring, and complete the run.
Creates a scenario with a scoring contract, starts a run, polls until the run is scored, then completes the run.
Step 1: createScenario (operation: createScenario)
Create a scenario with the supplied problem statement and a single bash script scorer weighted at 1.0.
Step 2: startRun (operation: startScenarioRun)
Start a scenario run, which provisions a devbox for the evaluation.
Step 3: pollRun (operation: getScenarioRun)
Poll the scenario run until it reaches the scored state, looping while it is running or scoring, and ending the flow on failure, timeout, or cancellation.
Step 4: completeRun (operation: completeScenarioRun)
Complete the scored run, shutting down the underlying devbox.
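The control flow of the four steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the Runloop SDK. The `ScenarioClient` interface is hypothetical (its method names only mirror the workflow's operation names), and the state strings, polling interval, and poll limit are assumed. The real API may return richer objects and use different state names.

```python
import time
from typing import Callable, Protocol


class ScenarioClient(Protocol):
    """Hypothetical client; method names mirror the workflow's operations."""

    def create_scenario(self, problem_statement: str, scorer_script: str) -> str: ...
    def start_scenario_run(self, scenario_id: str) -> str: ...
    def get_scenario_run(self, run_id: str) -> str: ...
    def complete_scenario_run(self, run_id: str) -> None: ...


# Assumed state names, taken from the wording of the pollRun step.
IN_PROGRESS = {"running", "scoring"}
TERMINAL_FAILURE = {"failed", "timeout", "canceled"}


def create_scenario_and_run(
    client: ScenarioClient,
    problem_statement: str,
    scorer_script: str,
    poll_interval: float = 5.0,   # assumed value
    max_polls: int = 120,         # assumed safety limit, not part of the workflow
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Run the four workflow steps in order and return the run ID."""
    # Step 1: create the scenario with a single bash script scorer.
    scenario_id = client.create_scenario(problem_statement, scorer_script)
    # Step 2: start a run, which provisions a devbox.
    run_id = client.start_scenario_run(scenario_id)
    # Step 3: poll until the run is scored.
    for _ in range(max_polls):
        state = client.get_scenario_run(run_id)
        if state == "scored":
            # Step 4: complete the run, shutting down the devbox.
            client.complete_scenario_run(run_id)
            return run_id
        if state in TERMINAL_FAILURE:
            raise RuntimeError(f"run {run_id} ended in state {state!r}")
        if state not in IN_PROGRESS:
            raise RuntimeError(f"run {run_id} reported unexpected state {state!r}")
        sleep(poll_interval)
    raise TimeoutError(f"run {run_id} was not scored after {max_polls} polls")
```

Passing `sleep` as a parameter keeps the loop easy to exercise against a fake client without waiting in real time.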