Scenarios & Graders
What is a Scenario?
A scenario packages one specific situation you want to put your agent into, plus what “handled it correctly” means for that situation. Think of it like a briefing you’d hand someone walking into a case cold — the state of things right now, what’s unfolding, and what a good resolution looks like:
- Starting state: what the mock services know when the simulation starts (for example, Maria Chen has two $47.50 charges from Bella Cucina on March 5), plus any broader context (a payment-processor outage, a market holiday). This is the world the agent wakes up into.
- Actor: whoever triggers the agent and keeps the interaction going. This can be a persona with objectives (a customer pursuing a refund), an incoming webhook payload, or a scheduled job. In multi-turn scenarios the actor drives the conversation toward its objectives until they’re met or the turn budget runs out.
- Success criteria: boolean checks evaluated against the transcript and final service state. They define what “handled correctly” means for this scenario.
One scenario executed end-to-end against your agent is a simulation.
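As a rough illustration of the three parts, a scenario could be modelled along the following lines. This is a hypothetical sketch, not Veris's actual schema: the class and field names (`Scenario`, `Actor`, `starting_state`, `max_turns`) and the two criteria strings are assumptions made for illustration; only the Maria Chen data comes from the example above.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Actor:
    """Whoever triggers the agent (hypothetical shape)."""
    kind: str                                   # e.g. "persona", "webhook", "scheduled_job"
    objectives: list[str] = field(default_factory=list)
    max_turns: int = 10                         # assumed turn budget for multi-turn scenarios


@dataclass
class Scenario:
    """One situation plus what 'handled correctly' means (hypothetical shape)."""
    starting_state: dict[str, Any]              # what the mock services know at start
    actor: Actor
    success_criteria: list[str]                 # human-readable descriptions of boolean checks


duplicate_charge = Scenario(
    starting_state={
        "customer": "Maria Chen",
        "charges": [
            {"merchant": "Bella Cucina", "amount": 47.50, "date": "March 5"},
            {"merchant": "Bella Cucina", "amount": 47.50, "date": "March 5"},
        ],
    },
    actor=Actor(kind="persona", objectives=["get a refund for the duplicate charge"]),
    success_criteria=[
        "exactly one of the two charges is refunded",   # illustrative criterion
        "the customer is told a refund was issued",     # illustrative criterion
    ],
)
```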
Assertions
Assertions are the success criteria of an individual scenario: two or three independent boolean checks, evaluated against the transcript and final service state after the simulation completes.
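To make "independent boolean checks" concrete, here is a minimal sketch of how such checks could be evaluated once a simulation has finished. The function names, the transcript format (a list of message strings), and the `refunded` field are all assumptions for illustration, not the format Veris uses.

```python
from typing import Any, Callable

# An assertion here is a named boolean check over (transcript, final_state).
Assertion = Callable[[list[str], dict[str, Any]], bool]


def one_charge_refunded(transcript: list[str], final_state: dict[str, Any]) -> bool:
    """True if exactly one charge ended up refunded in the final service state."""
    refunded = [c for c in final_state["charges"] if c["refunded"]]
    return len(refunded) == 1


def customer_informed(transcript: list[str], final_state: dict[str, Any]) -> bool:
    """True if any message in the transcript mentions a refund."""
    return any("refund" in message.lower() for message in transcript)


def evaluate(assertions: list[Assertion], transcript: list[str],
             final_state: dict[str, Any]) -> dict[str, bool]:
    """Run each check independently and report one boolean per check."""
    return {check.__name__: check(transcript, final_state) for check in assertions}


transcript = [
    "I was charged twice by Bella Cucina.",
    "I've issued a refund of $47.50 for the duplicate charge.",
]
final_state = {
    "charges": [
        {"amount": 47.50, "refunded": True},
        {"amount": 47.50, "refunded": False},
    ]
}

results = evaluate([one_charge_refunded, customer_informed], transcript, final_state)
passed = all(results.values())  # the scenario passes only if every check holds
```

Keeping each check independent means a failed simulation reports exactly which expectation was missed instead of a single opaque pass/fail.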
Scenario sets
Scenarios are organized into scenario sets — collections generated or uploaded together. veris scenarios create produces a scenario set containing multiple scenarios plus a grader that matches them.
Graders
Where assertions check scenario-specific outcomes, a grader checks general agent behaviors across a scenario set — hallucination, tool-use correctness, communication quality, and similar patterns. Graders read the same simulation trace assertions do; they just score things that aren’t tied to any one scenario.
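The contrast with assertions can be sketched as follows: a grader applies the same scenario-agnostic metric to every trace in a set. The trace format (a list of step dictionaries with `type` and `ok` keys) and the metric itself are hypothetical, chosen only to illustrate the idea; real graders may score behaviors quite differently.

```python
from statistics import mean


def tool_use_correctness(trace: list[dict]) -> float:
    """Fraction of tool calls in one trace that succeeded (hypothetical metric)."""
    calls = [step for step in trace if step["type"] == "tool_call"]
    if not calls:
        return 1.0  # no tool calls means nothing was used incorrectly
    return sum(step["ok"] for step in calls) / len(calls)


def grade_set(traces: dict[str, list[dict]]) -> tuple[dict[str, float], float]:
    """Score every scenario with the same metric, then average across the set."""
    per_scenario = {name: tool_use_correctness(trace) for name, trace in traces.items()}
    return per_scenario, mean(per_scenario.values())


traces = {
    "duplicate_charge": [
        {"type": "tool_call", "ok": True},
        {"type": "message"},
        {"type": "tool_call", "ok": False},
    ],
    "simple_refund": [
        {"type": "tool_call", "ok": True},
    ],
}

per_scenario, overall = grade_set(traces)
```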
Generating scenarios
veris scenarios create
veris scenarios create --num 10
veris scenarios status <SET_ID> --watch
Generation analyzes your agent’s code, identifies its capabilities and service integrations, and produces scenarios that exercise different code paths and edge cases, along with a grader tuned to the set.
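If you drive generation from a script (in CI, for instance), the commands can be assembled programmatically. The sketch below only builds argument lists from the flags documented on this page (`--num`, `--env-id`, `--watch`); the helper names are hypothetical. How the set ID is obtained from the output of `create` is not covered here, so it is passed in explicitly.

```python
import subprocess
from typing import Optional


def create_command(num: Optional[int] = None, env_id: Optional[str] = None) -> list[str]:
    """Build argv for `veris scenarios create [--num N] [--env-id ID]`."""
    cmd = ["veris", "scenarios", "create"]
    if num is not None:
        cmd += ["--num", str(num)]
    if env_id is not None:
        cmd += ["--env-id", env_id]
    return cmd


def status_command(set_id: str, watch: bool = False) -> list[str]:
    """Build argv for `veris scenarios status <SET_ID> [--watch]`."""
    cmd = ["veris", "scenarios", "status", set_id]
    if watch:
        cmd.append("--watch")
    return cmd


def run(cmd: list[str]) -> None:
    """Execute a command, raising CalledProcessError on a non-zero exit code."""
    subprocess.run(cmd, check=True)
```

For example, `run(create_command(num=10))` would invoke `veris scenarios create --num 10`, assuming the `veris` CLI is installed and on your PATH.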
Scenario types
When generating from the console, you can bias the distribution toward one type:
| Type | What it covers |
|---|---|
| Mixed (default) | A natural distribution across the other types |
| Simple | Straightforward happy-path interactions |
| Complex | Multi-step, multi-service, or multi-actor interactions |
| Error Handling | Service failures, bad data, tool errors the agent needs to recover from |
| Edge Case | Unusual or rare-but-legitimate situations |
| Adversarial | Deceptive, hostile, or agent-breaking actor behavior |
| Out of Scope | Requests the agent shouldn’t or can’t handle |
The CLI command veris scenarios create currently always uses the default Mixed distribution; choosing a specific type is available from the console.
CLI Commands
# Generate a scenario set and a matching grader
veris scenarios create [--num N] [--env-id ID]
# Check generation progress
veris scenarios status <SET_ID> [--watch]
# List scenario sets
veris scenarios list [--env-id ID]
# Open in console
veris scenarios get <SET_ID>
# Delete a scenario set
veris scenarios delete <SET_ID>
Using the console
The Scenarios page lists all scenario sets with their title, status, scenario count, and the grader mapped to the set. Click a scenario set to view individual scenarios and their details.