Simulations & Runs

What is a simulation?

A simulation is one scenario running against your agent inside an isolated container. It produces a transcript — a record of the actor’s messages, your agent’s responses, every API call made to services, and the actor’s internal reasoning. The transcript is what gets graded afterward.

During a simulation, a fresh container starts with your agent and its declared services. The services seed data based on the scenario, the actor initiates the conversation, and your agent responds, calling services as needed. The exchange continues until the actor’s objectives are met or the turn budget runs out.

Set actor.config.MAX_TURNS in veris.yaml to control the turn budget. For one-shot agents, set it to 1.
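
The stopping rule can be pictured with a small shell sketch. It is an illustration only, not how Veris implements the loop: run_turns and both of its arguments are hypothetical names, and the turn at which the actor's objectives are met is supplied by hand rather than decided by an actor.

```shell
#!/usr/bin/env bash
# Illustrative only: models the stopping rule, not Veris internals.
# run_turns MAX_TURNS OBJECTIVES_MET_AT
#   MAX_TURNS          turn budget (the role actor.config.MAX_TURNS plays)
#   OBJECTIVES_MET_AT  hypothetical turn at which the actor's objectives are met
run_turns() {
  local max_turns=$1 met_at=$2 turn=0
  while [ "$turn" -lt "$max_turns" ]; do
    turn=$((turn + 1))
    if [ "$turn" -ge "$met_at" ]; then
      echo "stopped after $turn turn(s): objectives met"
      return 0
    fi
  done
  echo "stopped after $turn turn(s): turn budget exhausted"
}

run_turns 10 3   # objectives are reached before the budget runs out
run_turns 1 3    # budget of 1, as suggested for one-shot agents
```

With a budget of 1 the loop ends after a single turn whether or not the objectives were met, which is why that setting suits agents that answer once and stop.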

What is a run?

A run is a batch of simulations. When you start a run, every scenario in the selected scenario set becomes one simulation, and they execute in parallel.
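
As a rough mental model of that fan-out, the sketch below starts one background job per scenario and waits for all of them. It assumes nothing about how Veris schedules containers; run_one_simulation, run_batch, and the scenario names are hypothetical stand-ins.

```shell
#!/usr/bin/env bash
# Illustrative only: one background job per scenario, then wait for all.
# run_one_simulation and the scenario names are hypothetical stand-ins.
run_one_simulation() {
  echo "simulation for $1 finished"
}

run_batch() {
  local scenario
  for scenario in "$@"; do
    run_one_simulation "$scenario" &   # each scenario becomes one simulation
  done
  wait                                 # the batch is done when every job exits
}

run_batch scenario_a scenario_b scenario_c
```

Because the jobs run concurrently, the order of the output lines is not guaranteed. Only the count is: one line per scenario.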

Creating a run

Interactive:

veris simulations create

Prompts for an environment and scenario set.

With flags:

veris simulations create \
  --scenario-set-id scenset_abc123 \
  --env-id env_xyz \
  --simulation-timeout 300

Flag                                  Description
--scenario-set-id                     Scenario set to run
--env-id                              Environment to use
--image-tag                           Image tag (default: latest)
--simulation-timeout                  Per-simulation timeout in seconds (60–3600)
--auto-evaluate / --no-auto-evaluate  Auto-run the grader once simulations finish (default: on)

With --auto-evaluate on (the default), the run’s grader starts as soon as the last simulation completes — no separate veris evaluations create step needed.
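
A script that launches runs may want to check the timeout against the documented 60 to 3600 second range before calling the CLI. The sketch below is one way to do that. valid_timeout and create_run are hypothetical helpers, not part of the veris CLI, and the command is printed rather than executed so the sketch runs without the CLI installed.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: the 60-3600 range comes from the flag table;
# the helpers themselves are not part of the veris CLI.
valid_timeout() {
  case $1 in
    ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
  esac
  [ "$1" -ge 60 ] && [ "$1" -le 3600 ]
}

create_run() {
  local scenario_set=$1 env_id=$2 timeout=$3
  if ! valid_timeout "$timeout"; then
    echo "timeout must be 60-3600 seconds, got: $timeout" >&2
    return 1
  fi
  # Printed rather than executed so the sketch runs without the CLI installed.
  echo veris simulations create \
    --scenario-set-id "$scenario_set" \
    --env-id "$env_id" \
    --simulation-timeout "$timeout" \
    --no-auto-evaluate
}

create_run scenset_abc123 env_xyz 300
```

The sketch passes --no-auto-evaluate to show the non-default path. In that case grading would have to be started separately with veris evaluations create once the simulations finish.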

Monitoring progress

# One-time status
veris simulations status <RUN_ID>

# Watch mode (polls every 3 seconds)
veris simulations status <RUN_ID> --watch

# Include the event stream
veris simulations status <RUN_ID> --watch --log

In the console, the Simulations page shows each run’s progress bar, per-simulation status, and timing. Click into a simulation to see the full transcript: messages between the actor and agent, the actor’s internal reasoning, every API call with request/response payloads, and the raw events.

Cancelling

veris simulations cancel <RUN_ID>

Stops all pending and running simulations in the run.

CLI Commands

# Create a run
veris simulations create [--scenario-set-id ID] [--env-id ID]

# List runs
veris simulations list [--status STATUS] [--env-id ID]

# Check status
veris simulations status <RUN_ID> [--watch] [--log]

# Cancel
veris simulations cancel <RUN_ID>