Introduction
Veris is a simulation platform for testing and training AI agents.
It provides a complete simulation sandbox to help you develop, test, and train high-quality AI agents that are optimized for automating a task. Veris environments come pre-loaded with simulated users and simulated versions of common services and SaaS platforms — your agent makes real API calls to LLM-powered mocks without any code changes.
Agent-Native Development
Traditional software development relies on unit tests and staging environments. But AI agents are fundamentally different — they make decisions, use tools, and have conversations. They need to be trained and tested in an environment that closely mirrors their final production setup.
Veris enables agent-native development: you build, test, and improve your agent inside a simulation environment that replicates the real world. The same services your agent will call in production (Salesforce, Google Calendar, Stripe, Jira, Slack, and more) are available as intelligent mocks that understand context and generate realistic data. A simulated user with specific goals drives the conversation, just like a real customer would.
This approach means every iteration happens in conditions that match production — so the behaviors you see during development are the behaviors you get in deployment.
The Workflow
One-time setup
- Setup — Install the CLI, configure
veris.yaml, package your agent, and push to Veris - Generate scenarios & graders — AI reads your agent’s code and creates test cases (regenerate when services or integrations change)
How to Use the Sandbox
- Development Loop — Iterative simulate, evaluate, report, and fix cycle
- CI/CD & Regression Testing — Automated quality gates in your deployment pipeline
- Training: Reinforcement Learning — Use simulations as a live training ground with reward signals
- Training: Supervised Fine-Tuning — Fine-tune models on high-scoring simulation transcripts
Development Loop
Simulate → Evaluate → Report → Fix → Simulate again. Each iteration improves your agent based on root cause analysis and actionable recommendations.
CI/CD & Regression Testing
Integrate Veris into your CI/CD pipeline to catch regressions before they ship. Every push triggers a simulation suite, evaluations compare against your baseline, and a quality gate blocks deploys that fall below your threshold.
Training: Reinforcement Learning
The simulation environment serves as a live training ground. Graders and assertions provide reward signals, and the model is updated to favor higher-scoring behaviors.
Training: Supervised Fine-Tuning
High-scoring simulation transcripts become supervised training data. Fine-tune a base model on correct agent behavior for your specific task domain.
Key Concepts
| Concept | What It Is |
|---|---|
| Environment | An isolated sandbox that simulates the production environment for an AI agent |
| Service | A simulated API (Salesforce, Calendar, Stripe, etc.) that your agent calls |
| Scenario | A test case defining actors, objectives, context, and assertions |
| Simulation | A single test run — one scenario executed in an isolated container |
| Run | A collection of simulations (e.g., a scenario set run at scale) |
| Grader | An evaluation function that scores agent behavior from transcripts |
| Evaluation | Results of graders running against simulation transcripts |
| Report | Root cause analysis with actionable recommendations and auto-apply |
| Training | Fine-tune models using simulation data (SFT) or environments (RL) |
Getting Started
- Quickstart — Get your first simulation running in under 10 minutes
- Templates — Ready-to-use
veris.yamltemplates for common setups - Full Walkthrough — Comprehensive guide explaining every step and concept
- CLI Installation — Install and authenticate the Veris CLI