CI/CD integration
Concrete workflow configs for running Veris in CI. For the strategy (when to gate, how to set thresholds, why to pin your scenario set), see CI/CD regression gating.
The examples on this page are GitHub Actions. The shape is the same on any CI system (GitLab CI, CircleCI, Buildkite, Jenkins, etc.): install the CLI, push an image, run the pipeline, post the result. Translate the YAML to your provider’s syntax.
Anatomy
Every Veris CI job is the same four steps:
- Install `veris-cli` and log in with an API key stored in your CI's secret store.
- Push the agent image to your Veris environment with a CI-specific tag (the PR SHA, the date, etc.).
- Run the pipeline with `veris run` against that tag.
- Post the result. `veris run` writes a markdown summary to stdout and exits non-zero on failure.
Progress logs go to stderr and clean markdown goes to stdout, so you can redirect the report to a file while still seeing progress in CI logs.
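A minimal sketch of that stream split, using a stand-in function in place of the real `veris run` call (the stub and its messages are invented purely for illustration):

```shell
# Stand-in for `veris run ...` -- invented here to show the stream split.
run_stub() {
  echo "simulating scenarios..." >&2   # progress -> stderr, stays visible in CI logs
  echo "## Veris summary"              # markdown report -> stdout
}

# Redirect stdout to a file; the stderr progress line still reaches the CI log.
run_stub > veris-summary.md

cat veris-summary.md
```

Because only stdout is redirected, the captured file contains clean markdown with no progress noise mixed in.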
GitHub Actions
A complete workflow that runs on every PR, pushes the image tagged with the PR commit SHA, and posts the markdown summary as a sticky PR comment:
```yaml
name: Veris Simulation

on:
  pull_request:
    types: [opened]
    branches: [main]

permissions:
  contents: read
  pull-requests: write

env:
  ENV_ID: "<your-env-id>"
  SCENARIO_SET_ID: "<your-scenario-set-id>"
  GRADER_ID: "<your-grader-id>"

jobs:
  simulate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - name: Install veris-cli
        run: pip install veris-cli
      - name: Build & push agent image
        env:
          VERIS_API_KEY: ${{ secrets.VERIS_API_KEY }}
        run: |
          veris login "$VERIS_API_KEY"
          veris env push --env-id ${{ env.ENV_ID }} --tag ${{ github.sha }}
      - name: Run simulation & evaluation
        run: |
          veris run \
            --scenario-set-id ${{ env.SCENARIO_SET_ID }} \
            --env-id ${{ env.ENV_ID }} \
            --grader-id ${{ env.GRADER_ID }} \
            --image-tag ${{ github.sha }} \
            > veris-summary.md
      - name: Comment on PR
        uses: marocchino/sticky-pull-request-comment@v2
        with:
          path: veris-summary.md
```

Grab the three IDs (`ENV_ID`, `SCENARIO_SET_ID`, `GRADER_ID`) locally with `veris env list`, `veris scenarios list`, and `veris evaluations list`.
This workflow triggers on `pull_request: [opened]`. To re-run on new commits, add `synchronize` to the `types` list, or use the Re-run jobs button in the GitHub Actions UI.
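With `synchronize` added, the trigger block becomes:

```yaml
on:
  pull_request:
    types: [opened, synchronize]
    branches: [main]
```

`synchronize` fires whenever new commits are pushed to the PR branch, so every revision gets a fresh simulation run.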
Nightly regressions
Same shape as the PR workflow, with two differences: a cron trigger and a date-stamped image tag.
```yaml
on:
  schedule:
    - cron: '0 6 * * *'   # 6 AM UTC daily
  workflow_dispatch: {}   # allow manual trigger

# ... same permissions, env, and setup steps ...

      - name: Build & push agent image
        env:
          VERIS_API_KEY: ${{ secrets.VERIS_API_KEY }}
        run: |
          IMAGE_TAG="nightly-$(date -u +%Y%m%d)"
          echo "IMAGE_TAG=$IMAGE_TAG" >> "$GITHUB_ENV"
          veris login "$VERIS_API_KEY"
          veris env push --env-id ${{ env.ENV_ID }} --tag "$IMAGE_TAG"
      - name: Run simulation & evaluation
        run: |
          veris run \
            --scenario-set-id ${{ env.SCENARIO_SET_ID }} \
            --env-id ${{ env.ENV_ID }} \
            --grader-id ${{ env.GRADER_ID }} \
            --image-tag "$IMAGE_TAG"
```

Results appear on the Benchmarks page in the console, where you can track trends over time.
Nightly image tags must use the format `nightly-YYYYMMDD` (e.g. `nightly-20260331`) for the console to parse them into the trend chart.
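One way to generate and sanity-check that tag in a shell step (the validation guard is a suggestion on top of the workflow above, not part of it):

```shell
# Generate a nightly image tag in the required nightly-YYYYMMDD format (UTC).
IMAGE_TAG="nightly-$(date -u +%Y%m%d)"

# Fail fast if the tag is malformed, rather than letting the console
# silently drop the run from the trend chart.
echo "$IMAGE_TAG" | grep -Eq '^nightly-[0-9]{8}$' || {
  echo "bad tag: $IMAGE_TAG" >&2
  exit 1
}

echo "$IMAGE_TAG"
```

Using `date -u` matters here: a runner in a non-UTC timezone could otherwise stamp the wrong day near midnight.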
Output
The markdown summary written to stdout contains:
- Run metadata. Run ID, eval run ID, status, scenario set, duration.
- Results by category. Aggregated pass/fail scores across all simulations.
- Scenario-level success criteria. Assertion pass rate.
- Console link. Direct link to the evaluation run.
It's structured for pasting directly into a PR comment. For the full set of `veris run` flags, see CLI commands.
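As noted at the top of this page, the same four steps translate to any provider. A hypothetical GitLab CI version, as a sketch: the job name, stage, and artifact layout are illustrative choices, while the `veris` commands and ID variables match the GitHub Actions workflow above (store `VERIS_API_KEY` as a masked CI/CD variable):

```yaml
simulate:
  stage: test
  image: python:3.11
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
  script:
    - pip install veris-cli
    - veris login "$VERIS_API_KEY"
    - veris env push --env-id "$ENV_ID" --tag "$CI_COMMIT_SHA"
    - >
      veris run
      --scenario-set-id "$SCENARIO_SET_ID"
      --env-id "$ENV_ID"
      --grader-id "$GRADER_ID"
      --image-tag "$CI_COMMIT_SHA"
      > veris-summary.md
  artifacts:
    paths: [veris-summary.md]
    when: always
```

`when: always` keeps the markdown summary as an artifact even when `veris run` exits non-zero, so a failed gate still leaves a report to inspect.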