Introduction

Veris

Veris runs your AI agent end-to-end against a simulated version of its production environment. That means simulating both sides of the agent:

  • what talks to the agent — a human user chatting with it, a webhook from another system, a scheduled job, or a handoff from another agent. Veris drives any of these with realistic goals, context, and triggering payloads.
  • what the agent talks to — Slack, Stripe, Salesforce, Calendar, Postgres, and more. Veris intercepts outbound API calls and routes them to LLM-powered mocks that respond with scenario-appropriate data.
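One common way an agent stays mockable like this is to read each service's base URL from its environment, so a sandbox can redirect outbound calls without any code changes. The sketch below illustrates that general pattern only; the environment-variable names are assumptions, not Veris's actual mechanism:

```python
import os

# Illustrative assumption: production and the sandbox inject different
# base URLs for the same service, so the agent runs identical code in both.
STRIPE_BASE = os.environ.get("STRIPE_API_BASE", "https://api.stripe.com")

def charge_url(charge_id: str) -> str:
    # The agent builds requests against whatever base it was given; in a
    # sandbox this would resolve to a mock endpoint instead of Stripe.
    return f"{STRIPE_BASE}/v1/charges/{charge_id}"
```

In production the default applies; in simulation the variable is overridden to point at the mock, which is why no wrapper or simulation-only branch is needed in the agent itself.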

You push your agent as a container. Veris generates scenarios, drives them through the agent, grades the transcripts, and produces a report. Your agent runs the same code it runs in production. No wrappers, no simulation-only branches.

Agents fail in ways unit tests can’t catch — bad tool calls, wrong decisions mid-task, regressions after a prompt tweak. Veris is the loop for finding and fixing those before they hit production.

The same sandbox is used for the dev loop, CI regression gating, RL and SFT training, and regulatory QA. Integration is the one path you have to walk first; every use case flows from there.

Primitives

  • Environment — your agent packaged as a container, plus the services it’s allowed to talk to.
  • Scenario — a test case: the input that starts the agent (a simulated user with goals, or a triggering event like a webhook or scheduled job), the context it runs in, and what counts as success.
  • Simulation — one scenario executed end-to-end against your agent, in an isolated container.
  • Evaluation — graders that score each simulation transcript against its success criteria.
  • Report — root-cause analysis across a batch of simulations, with concrete fix suggestions.
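The Evaluation primitive can be pictured as a function from a simulation transcript plus success criteria to a score. A minimal deterministic sketch (all names hypothetical, not the Veris API; real graders would be richer, e.g. LLM judges and tool-call checks, but the shape is the same):

```python
from dataclasses import dataclass, field

@dataclass
class GradeResult:
    passed: bool
    missing: list = field(default_factory=list)

def grade_transcript(transcript: list, required_phrases: list) -> GradeResult:
    # Toy grader: pass only if every required phrase appears in the transcript.
    text = "\n".join(transcript).lower()
    missing = [p for p in required_phrases if p.lower() not in text]
    return GradeResult(passed=not missing, missing=missing)
```

A report then aggregates these per-scenario results across a batch and attaches root-cause analysis; the grader itself only scores one transcript.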

What the loop looks like

```shell
# Package and push your agent
veris env push

# Generate test cases from your agent's code
veris scenarios create

# Simulate, evaluate, report
veris run
```

This just shows the shape of the workflow. For a working “run it on my agent” path, see the Quickstart.

Start here

If you haven’t already, open the console and click an example agent — that’s the fastest “see it work” moment. Once you’ve seen it run there, come back and do the same with your own agent:

Quickstart — point your coding agent at the integration skill, or walk through the config yourself. ~15–30 minutes to your first report.