Scenarios & Graders

What is a Scenario?

A scenario packages one specific situation you want to put your agent into, plus what “handled it correctly” means for that situation. Think of it like a briefing you’d hand someone walking into a case cold — the state of things right now, what’s unfolding, and what a good resolution looks like:

  • Starting state — what the mock services know when the simulation starts (Maria Chen has two $47.50 charges from Bella Cucina on March 5), plus any broader context (a payment-processor outage, a market holiday). This is the world the agent wakes up into.
  • Actor — whoever triggers the agent and keeps the interaction going. A persona with objectives (a customer pursuing a refund), an incoming webhook payload, a scheduled job. For multi-turn scenarios the actor drives the conversation toward its objectives until they’re met or the turn budget runs out.
  • Success criteria — boolean checks evaluated against the transcript and final service state. They define what “handled correctly” means for this scenario.

One scenario executed end-to-end against your agent is a simulation.
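
The three parts above can be pictured as a small data structure. The sketch below is illustrative only: `Scenario`, `Actor`, the field names, the charge IDs, and the reading of the two charges as a duplicate are all assumptions made for this example, not the Veris schema.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

# Hypothetical shapes, chosen for illustration only.
Transcript = list[dict[str, str]]   # e.g. [{"role": "actor", "text": "..."}]
ServiceState = dict[str, Any]       # what the mock services know
Check = Callable[[Transcript, ServiceState], bool]


@dataclass
class Actor:
    persona: str
    objectives: list[str]
    max_turns: int = 10             # assumed turn budget for multi-turn runs


@dataclass
class Scenario:
    starting_state: ServiceState
    actor: Actor
    success_criteria: dict[str, Check] = field(default_factory=dict)


duplicate_charge = Scenario(
    starting_state={
        "charges": [  # charge IDs are made up for the example
            {"id": "ch_1", "customer": "Maria Chen", "merchant": "Bella Cucina",
             "amount": 47.50, "date": "March 5", "refunded": False},
            {"id": "ch_2", "customer": "Maria Chen", "merchant": "Bella Cucina",
             "amount": 47.50, "date": "March 5", "refunded": False},
        ]
    },
    actor=Actor(
        persona="Customer pursuing a refund",
        objectives=["Get one of the two charges refunded"],
    ),
    success_criteria={
        # Boolean check against the final service state.
        "exactly_one_refund": lambda transcript, state: sum(
            1 for c in state["charges"] if c["refunded"]
        ) == 1,
    },
)
```

Keeping the success criteria as plain callables over the transcript and final state mirrors the description above: each one answers a yes/no question about how the run ended.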

Assertions

Assertions are a scenario's success criteria: two or three independent boolean checks, evaluated against the transcript and final service state after the simulation completes.
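
As a rough sketch of what independent boolean checks evaluated after the simulation could look like, the example below runs two checks over a transcript and a final service state. The function names, the transcript and state shapes, and the sample data are hypothetical; Veris's actual assertion format may differ.

```python
from typing import Any, Callable

# Hypothetical shapes: a transcript is a list of turns, state is a plain dict.
Transcript = list[dict[str, str]]
ServiceState = dict[str, Any]
Assertion = Callable[[Transcript, ServiceState], bool]


def refunded_exactly_one_charge(transcript: Transcript, state: ServiceState) -> bool:
    # Looks only at the final service state.
    return sum(1 for c in state["charges"] if c["refunded"]) == 1


def agent_confirmed_refund(transcript: Transcript, state: ServiceState) -> bool:
    # Looks only at the transcript.
    return any(
        t["role"] == "agent" and "refund" in t["text"].lower() for t in transcript
    )


def evaluate(
    assertions: dict[str, Assertion], transcript: Transcript, state: ServiceState
) -> dict[str, bool]:
    # Every check runs; one failure does not short-circuit the others.
    return {name: check(transcript, state) for name, check in assertions.items()}


transcript = [
    {"role": "actor", "text": "I was charged twice at Bella Cucina."},
    {"role": "agent", "text": "I've refunded the duplicate $47.50 charge."},
]
final_state = {
    "charges": [
        {"id": "ch_1", "refunded": False},
        {"id": "ch_2", "refunded": True},
    ]
}

results = evaluate(
    {
        "refunded_exactly_one_charge": refunded_exactly_one_charge,
        "agent_confirmed_refund": agent_confirmed_refund,
    },
    transcript,
    final_state,
)
print(results)
# {'refunded_exactly_one_charge': True, 'agent_confirmed_refund': True}
```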

Scenario sets

Scenarios are organized into scenario sets — collections generated or uploaded together. veris scenarios create produces a scenario set containing multiple scenarios plus a grader that matches them.

Graders

Where assertions check scenario-specific outcomes, a grader checks general agent behaviors across a scenario set — hallucination, tool-use correctness, communication quality, and similar patterns. Graders read the same simulation trace assertions do; they just score things that aren’t tied to any one scenario.
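
To make the contrast with assertions concrete, here is a minimal sketch of a set-level score. It assumes a made-up trace shape in which each tool call carries an `ok` flag, and it scores only one behavior (tool-use correctness) with a simple success ratio. Real graders are generated alongside the scenario set and are likely richer than this.

```python
from statistics import mean


def tool_use_score(trace: dict) -> float:
    """Fraction of tool calls in one simulation trace that succeeded."""
    calls = trace["tool_calls"]
    if not calls:
        return 1.0  # no tool calls, so nothing was misused
    return sum(1 for c in calls if c["ok"]) / len(calls)


def grade_set(traces: list[dict]) -> dict[str, float]:
    """Average a scenario-independent behavior score across the whole set."""
    return {"tool_use_correctness": mean(tool_use_score(t) for t in traces)}


# Hypothetical traces from two different scenarios in the same set.
traces = [
    {"scenario": "duplicate-charge",
     "tool_calls": [{"name": "lookup_charges", "ok": True},
                    {"name": "issue_refund", "ok": True}]},
    {"scenario": "processor-outage",
     "tool_calls": [{"name": "issue_refund", "ok": False},
                    {"name": "issue_refund", "ok": True}]},
]

print(grade_set(traces))  # {'tool_use_correctness': 0.75}
```

Note that nothing in `grade_set` refers to a particular scenario: the same scoring function applies to every trace in the set.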

Generating scenarios

    veris scenarios create
    veris scenarios create --num 10
    veris scenarios status <SET_ID> --watch

Generation analyzes your agent’s code, identifies its capabilities and service integrations, and produces scenarios that exercise different code paths and edge cases — plus a grader tuned to the set.

Scenario types

When generating from the console, you can bias the distribution toward one type:

| Type | What it covers |
| --- | --- |
| Mixed (default) | A natural distribution across the other types |
| Simple | Straightforward happy-path interactions |
| Complex | Multi-step, multi-service, or multi-actor interactions |
| Error Handling | Service failures, bad data, tool errors the agent needs to recover from |
| Edge Case | Unusual or rare-but-legitimate situations |
| Adversarial | Deceptive, hostile, or agent-breaking actor behavior |
| Out of Scope | Requests the agent shouldn’t or can’t handle |

Currently, the CLI’s veris scenarios create command always uses the default Mixed distribution.

CLI Commands

    # Generate a scenario set and a matching grader
    veris scenarios create [--num N] [--env-id ID]

    # Check generation progress
    veris scenarios status <SET_ID> [--watch]

    # List scenario sets
    veris scenarios list [--env-id ID]

    # Open in console
    veris scenarios get <SET_ID>

    # Delete a scenario set
    veris scenarios delete <SET_ID>
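
If you want to drive these commands from a script, one possible approach is to build the argument lists in Python and hand them to `subprocess`. The helper names below (`create_args`, `status_args`, `run`) are hypothetical and only wrap the flags listed above. The sketch assumes `veris` is on your PATH and makes no assumption about what the commands print.

```python
import subprocess
from typing import Optional


def create_args(num: Optional[int] = None, env_id: Optional[str] = None) -> list[str]:
    """Build argv for `veris scenarios create [--num N] [--env-id ID]`."""
    args = ["veris", "scenarios", "create"]
    if num is not None:
        args += ["--num", str(num)]
    if env_id is not None:
        args += ["--env-id", env_id]
    return args


def status_args(set_id: str, watch: bool = False) -> list[str]:
    """Build argv for `veris scenarios status <SET_ID> [--watch]`."""
    args = ["veris", "scenarios", "status", set_id]
    if watch:
        args.append("--watch")
    return args


def run(args: list[str]) -> None:
    # check=True raises CalledProcessError if the command exits non-zero.
    subprocess.run(args, check=True)


# Inspect the command before running it.
print(create_args(num=10))
# ['veris', 'scenarios', 'create', '--num', '10']
```

Passing a list to `subprocess.run` rather than a single string avoids shell quoting issues with set IDs or environment IDs.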

Using the console

The Scenarios page lists all scenario sets with their title, status, scenario count, and the grader mapped to the set. Click a scenario set to view individual scenarios and their details.