Simulations & Runs
What is a simulation?
A simulation is one scenario running against your agent inside an isolated container. It produces a transcript — a record of the actor’s messages, your agent’s responses, every API call made to services, and the actor’s internal reasoning. The transcript is what gets graded afterward.
During a simulation, a fresh container starts with your agent and its declared services; the services seed data based on the scenario; the actor opens the conversation; your agent responds, calling services as needed; and the exchange continues until the actor’s objectives are met or the turn budget runs out.
Set actor.config.MAX_TURNS in veris.yaml to control the turn budget. For one-shot agents, set it to 1.
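The stop condition described above can be sketched as a toy loop. This is only an illustration of how a turn budget interacts with the actor's objectives, not how Veris implements it. The function name `simulate_turns` and its second argument (the turn on which the objectives happen to be met) are hypothetical.

```shell
# Toy model of the turn loop; NOT Veris's actual implementation.
# simulate_turns MAX_TURNS OBJECTIVES_MET_AT
#   MAX_TURNS          turn budget (the role actor.config.MAX_TURNS plays)
#   OBJECTIVES_MET_AT  hypothetical turn on which the actor's objectives are met
simulate_turns() {
  max_turns=$1
  met_at=$2
  turn=1
  while [ "$turn" -le "$max_turns" ]; do
    # One turn: the actor sends a message and the agent responds.
    if [ "$turn" -ge "$met_at" ]; then
      echo "stopped after turn $turn: objectives met"
      return 0
    fi
    turn=$((turn + 1))
  done
  echo "stopped after turn $max_turns: turn budget exhausted"
}

simulate_turns 10 3   # objectives are met before the budget runs out
simulate_turns 1 3    # budget of 1: a single exchange, as for a one-shot agent
```

With a budget of 1 the loop ends after a single exchange regardless of the objectives, which is why that setting suits one-shot agents.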
What is a run?
A run is a batch of simulations. When you start a run, every scenario in the selected scenario set becomes one simulation, and they execute in parallel.
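As a rough mental model, the fan-out resembles starting one background job per scenario and waiting for all of them. The sketch below uses plain shell job control; the function `run_batch` and the scenario names are made up for illustration, and it says nothing about how Veris schedules containers.

```shell
# Toy model of a run: one background job per scenario, then wait for all.
# Hypothetical names throughout; this does not call the Veris CLI.
run_batch() {
  for scenario in "$@"; do
    # Each scenario becomes its own simulation; '&' starts it without waiting.
    ( echo "simulation for $scenario finished" ) &
  done
  wait  # the run is complete once every simulation has finished
  echo "run complete: $# simulations"
}

run_batch scenario_a scenario_b scenario_c
```

Because the jobs run concurrently, the per-simulation lines can appear in any order; only the final summary line is guaranteed to come last.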
Creating a run
Interactive:
veris simulations create
Prompts for an environment and scenario set.
With flags:
veris simulations create \
--scenario-set-id scenset_abc123 \
--env-id env_xyz \
--simulation-timeout 300

| Flag | Description |
|---|---|
| --scenario-set-id | Scenario set to run |
| --env-id | Environment to use |
| --image-tag | Image tag (default: latest) |
| --simulation-timeout | Per-simulation timeout in seconds (60–3600) |
| --auto-evaluate / --no-auto-evaluate | Auto-run the grader once simulations finish (default: on) |
With --auto-evaluate on (the default), the run’s grader starts as soon as the last simulation completes — no separate veris evaluations create step needed.
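If you script run creation, it can help to check the timeout against the documented 60 to 3600 second range before calling the CLI. The wrapper below is a minimal sketch: `create_run` is a hypothetical helper, and the fallback of 300 seconds simply mirrors the example above rather than any documented default. Only the flags listed in the table are used.

```shell
# Hypothetical wrapper around `veris simulations create`.
# create_run SCENARIO_SET_ID ENV_ID [TIMEOUT_SECONDS]
create_run() {
  scenario_set_id=$1
  env_id=$2
  timeout=${3:-300}  # assumption: 300 mirrors the example, not a CLI default

  # Reject anything that is not a whole number of seconds.
  case $timeout in
    ''|*[!0-9]*)
      echo "timeout must be a whole number of seconds" >&2
      return 2
      ;;
  esac

  # --simulation-timeout accepts 60 to 3600 seconds; fail early otherwise.
  if [ "$timeout" -lt 60 ] || [ "$timeout" -gt 3600 ]; then
    echo "timeout must be between 60 and 3600 seconds" >&2
    return 2
  fi

  veris simulations create \
    --scenario-set-id "$scenario_set_id" \
    --env-id "$env_id" \
    --simulation-timeout "$timeout"
}

# Usage (requires the veris CLI on PATH):
#   create_run scenset_abc123 env_xyz 600
```

Auto-evaluation is left at its default here, so the grader would start once the last simulation completes.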
Monitoring progress
# One-time status
veris simulations status <RUN_ID>
# Watch mode (polls every 3 seconds)
veris simulations status <RUN_ID> --watch
# Include the event stream
veris simulations status <RUN_ID> --watch --log
In the console, the Simulations page shows each run’s progress bar, per-simulation status, and timing. Click into a simulation to see the full transcript: messages between the actor and agent, the actor’s internal reasoning, every API call with request/response payloads, and the raw events.
Cancelling
veris simulations cancel <RUN_ID>
Stops all pending and running simulations in the run.
CLI Commands
# Create a run
veris simulations create [--scenario-set-id ID] [--env-id ID]
# List runs
veris simulations list [--status STATUS] [--env-id ID]
# Check status
veris simulations status <RUN_ID> [--watch] [--log]
# Cancel
veris simulations cancel <RUN_ID>