Scenarios & Graders

What is a Scenario?

A scenario packages one specific situation you want to put your agent into, plus what “handled it correctly” means for that situation. Think of it like a briefing you’d hand someone walking into a case cold — the state of things right now, what’s unfolding, and what a good resolution looks like:

Starting state: what the mock services know when the simulation starts (Maria Chen has two $47.50 charges from Bella Cucina on March 5), plus any broader context (a payment-processor outage, a market holiday). This is the world the agent wakes up into.
Actor: whoever triggers the agent and keeps the interaction going. This can be a persona with objectives (a customer pursuing a refund), an incoming webhook payload, or a scheduled job. For multi-turn scenarios, the actor drives the conversation toward its objectives until they’re met or the turn budget runs out.
Success criteria: boolean checks evaluated against the transcript and final service state. They define what “handled correctly” means for this scenario.

One scenario executed end-to-end against your agent is a simulation.
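To make the pieces concrete, here is a minimal Python sketch of how a scenario and a multi-turn simulation loop could be modeled. Every name in it (Actor, Scenario, run_simulation, toy_agent, max_turns) is hypothetical and chosen for illustration; it is not the Veris data model or API. It only shows the idea of an actor driving the conversation until its objectives are met or the turn budget runs out.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical types for illustration only; not the Veris schema.
Transcript = list[tuple[str, str]]  # (speaker, message) pairs


@dataclass
class Actor:
    opening_message: str
    # Returns True once the actor's objectives are met.
    objectives_met: Callable[[Transcript], bool]
    # Produces the actor's next message given the conversation so far.
    next_message: Callable[[Transcript], str]


@dataclass
class Scenario:
    starting_state: dict
    actor: Actor
    max_turns: int = 5  # turn budget


def run_simulation(scenario: Scenario, agent: Callable[[str, dict], str]) -> Transcript:
    """Drive the agent until the actor's objectives are met or turns run out."""
    state = dict(scenario.starting_state)  # shallow copy of the starting state
    transcript: Transcript = []
    message = scenario.actor.opening_message
    for _ in range(scenario.max_turns):
        transcript.append(("actor", message))
        transcript.append(("agent", agent(message, state)))
        if scenario.actor.objectives_met(transcript):
            break
        message = scenario.actor.next_message(transcript)
    return transcript


def toy_agent(message: str, state: dict) -> str:
    # Stand-in for the agent under test.
    if "refund" in message.lower():
        return "I have issued a refund."
    return "Let me look into that."


scenario = Scenario(
    starting_state={"charges": [47.50, 47.50]},
    actor=Actor(
        opening_message="I was charged twice by Bella Cucina.",
        objectives_met=lambda t: "refund" in t[-1][1].lower(),
        next_message=lambda t: "Can you refund the duplicate?",
    ),
)
transcript = run_simulation(scenario, toy_agent)
```

With this toy agent the actor's objective is met on the second turn, so the loop stops before the turn budget is used up.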

Assertions

Assertions are how a scenario’s success criteria are expressed: two or three independent boolean checks, evaluated after the simulation completes.
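As a rough illustration, the duplicate-charge situation described earlier might be checked with assertions like the ones below. This is a minimal sketch, assuming the agent's messages are available as a list of strings and the final service state as a dictionary. The function names and data shapes are hypothetical, not the Veris assertion format, and treating the scenario as passed only when every check holds is also an assumption.

```python
# Hypothetical data shapes for illustration; not the Veris assertion format.
final_state = {
    "charges": [
        {"merchant": "Bella Cucina", "amount": 47.50, "refunded": False},
        {"merchant": "Bella Cucina", "amount": 47.50, "refunded": True},
    ]
}
agent_messages = [
    "I see two $47.50 charges from Bella Cucina on March 5.",
    "I have refunded the duplicate charge of $47.50.",
]


def exactly_one_charge_refunded(state: dict) -> bool:
    """Check the final service state: only the duplicate should be refunded."""
    refunded = [c for c in state["charges"] if c["refunded"]]
    return len(refunded) == 1


def agent_confirmed_refund(messages: list[str]) -> bool:
    """Check the transcript: the agent told the customer about the refund."""
    return any("refunded" in m.lower() for m in messages)


# Each assertion is an independent boolean, evaluated after the run.
results = {
    "exactly_one_charge_refunded": exactly_one_charge_refunded(final_state),
    "agent_confirmed_refund": agent_confirmed_refund(agent_messages),
}
# Assumption: the scenario counts as passed only if every check holds.
passed = all(results.values())
```

Keeping each check independent means a failure points at one specific outcome (state or transcript) rather than at the scenario as a whole.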

Scenario sets

Scenarios are organized into scenario sets — collections generated or uploaded together. veris scenarios create produces a scenario set containing multiple scenarios plus a grader that matches them.

Graders

Where assertions check scenario-specific outcomes, a grader checks general agent behaviors across a scenario set, such as hallucination, tool-use correctness, communication quality, and similar patterns. Graders read the same simulation trace that assertions do; they simply score behaviors that aren’t tied to any one scenario.
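The distinction can be sketched in Python as follows. This toy grader scores one general behavior, whether each tool call names a known tool, across every trace in a set. The trace format, the tool names, and the scoring rule are all assumptions made for illustration; this is not how Veris graders are implemented.

```python
# Hypothetical trace format: each trace lists the tool names the agent called.
KNOWN_TOOLS = {"lookup_charges", "issue_refund"}

traces = {
    "scenario-1": ["lookup_charges", "issue_refund"],
    "scenario-2": ["lookup_charges", "cancel_card"],  # one unknown tool
    "scenario-3": [],
}


def tool_use_score(tool_calls: list[str]) -> float:
    """Fraction of calls that name a known tool; 1.0 if no calls were made."""
    if not tool_calls:
        return 1.0
    valid = sum(1 for name in tool_calls if name in KNOWN_TOOLS)
    return valid / len(tool_calls)


# Unlike an assertion, the same check runs over every scenario in the set.
scores = {scenario_id: tool_use_score(calls) for scenario_id, calls in traces.items()}
set_average = sum(scores.values()) / len(scores)
```

Here the check knows nothing about refunds or any particular customer, which is what lets it apply to the whole set.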

Generating scenarios


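# Generate a scenario set with default settings
veris scenarios create
# Generate a scenario set with 10 scenarios
veris scenarios create --num 10
# Follow generation progress for the set
veris scenarios status <SET_ID> --watch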

Generation analyzes your agent’s code, identifies its capabilities and service integrations, and produces scenarios that exercise different code paths and edge cases — plus a grader tuned to the set.

Scenario types

When generating from the console, you can bias the distribution toward one type:

Type	What it covers
Mixed (default)	A natural distribution across the other types
Simple	Straightforward happy-path interactions
Complex	Multi-step, multi-service, or multi-actor interactions
Error Handling	Service failures, bad data, tool errors the agent needs to recover from
Edge Case	Unusual or rare-but-legitimate situations
Adversarial	Deceptive, hostile, or agent-breaking actor behavior
Out of Scope	Requests the agent shouldn’t or can’t handle

The CLI does not currently expose this choice: veris scenarios create always uses the default Mixed distribution.

CLI Commands


# Generate a scenario set and a matching grader
veris scenarios create [--num N] [--env-id ID]
 
# Check generation progress
veris scenarios status <SET_ID> [--watch]
 
# List scenario sets
veris scenarios list [--env-id ID]
 
# Open in console
veris scenarios get <SET_ID>
 
# Delete a scenario set
veris scenarios delete <SET_ID>

Using the console

The Scenarios page lists all scenario sets with their title, status, scenario count, and the grader mapped to the set. Click a scenario set to view individual scenarios and their details.