
CI/CD integration

Concrete workflow configs for running Veris in CI. For the strategy (when to gate, how to set thresholds, why to pin your scenario set), see CI/CD regression gating.

The examples on this page are GitHub Actions. The shape is the same on any CI system (GitLab CI, CircleCI, Buildkite, Jenkins, etc.): install the CLI, push an image, run the pipeline, post the result. Translate the YAML to your provider’s syntax.
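As a rough sketch of that translation, the PR job below might look like this in GitLab CI. The job name, the IDs in the variables block, and the masked VERIS_API_KEY CI/CD variable are placeholders; the veris commands are the same ones used in the GitHub Actions workflow later on this page.

.gitlab-ci.yml (sketch)
veris-simulation:
  image: python:3.11
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
  variables:
    ENV_ID: "<your-env-id>"
    SCENARIO_SET_ID: "<your-scenario-set-id>"
    GRADER_ID: "<your-grader-id>"
  script:
    - pip install veris-cli
    - veris login "$VERIS_API_KEY"   # masked CI/CD variable set in project settings
    - veris env push --env-id "$ENV_ID" --tag "$CI_COMMIT_SHA"
    - |
      veris run \
        --scenario-set-id "$SCENARIO_SET_ID" \
        --env-id "$ENV_ID" \
        --grader-id "$GRADER_ID" \
        --image-tag "$CI_COMMIT_SHA" \
        > veris-summary.md
  artifacts:
    paths:
      - veris-summary.md   # keep the report as a job artifact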

Anatomy

Every Veris CI job follows the same four steps:

  1. Install veris-cli and log in with an API key stored in your CI’s secret store.
  2. Push the agent image to your Veris environment with a CI-specific tag (the PR SHA, the date, etc.).
  3. Run the pipeline with veris run against that tag.
  4. Post the result. veris run writes a markdown summary to stdout and exits non-zero on failure.

Progress logs go to stderr and clean markdown goes to stdout, so you can redirect the report to a file while still seeing progress in CI logs.
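In shell terms the redirection looks like this (the IDs and image tag are placeholders):

# Progress (stderr) stays visible in the CI log; only the report (stdout) lands in the file.
veris run --scenario-set-id "$SCENARIO_SET_ID" --env-id "$ENV_ID" \
  --grader-id "$GRADER_ID" --image-tag "$IMAGE_TAG" > veris-summary.md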

GitHub Actions

A complete workflow that runs on every PR, pushes the image tagged with the PR commit SHA, and posts the markdown summary as a sticky PR comment:

.github/workflows/veris.yaml
name: Veris Simulation

on:
  pull_request:
    types: [opened]
    branches: [main]

permissions:
  contents: read
  pull-requests: write

env:
  ENV_ID: "<your-env-id>"
  SCENARIO_SET_ID: "<your-scenario-set-id>"
  GRADER_ID: "<your-grader-id>"

jobs:
  simulate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Install veris-cli
        run: pip install veris-cli

      - name: Build & push agent image
        env:
          VERIS_API_KEY: ${{ secrets.VERIS_API_KEY }}
        run: |
          veris login "$VERIS_API_KEY"
          veris env push --env-id ${{ env.ENV_ID }} --tag ${{ github.sha }}

      - name: Run simulation & evaluation
        run: |
          veris run \
            --scenario-set-id ${{ env.SCENARIO_SET_ID }} \
            --env-id ${{ env.ENV_ID }} \
            --grader-id ${{ env.GRADER_ID }} \
            --image-tag ${{ github.sha }} \
            > veris-summary.md

      - name: Comment on PR
        uses: marocchino/sticky-pull-request-comment@v2
        with:
          path: veris-summary.md

Grab the three IDs (ENV_ID, SCENARIO_SET_ID, GRADER_ID) locally with veris env list, veris scenarios list, and veris evaluations list.
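Run them once from a logged-in shell and copy the IDs into the workflow's env block:

veris env list          # ENV_ID
veris scenarios list    # SCENARIO_SET_ID
veris evaluations list  # GRADER_ID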

This workflow triggers on pull_request: [opened]. To re-run on new commits, add synchronize to the types list, or use the Re-run jobs button in the GitHub Actions UI.
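For example, to re-run on every push to the PR branch, change the trigger to:

on:
  pull_request:
    types: [opened, synchronize]
    branches: [main]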

Nightly regressions

Same shape as the PR workflow, with two differences: a cron trigger and a date-stamped image tag.

.github/workflows/veris-nightly.yaml
on:
  schedule:
    - cron: '0 6 * * *'   # 6 AM UTC daily
  workflow_dispatch: {}   # allow manual trigger

# ... same permissions, env, and setup steps ...

      - name: Build & push agent image
        env:
          VERIS_API_KEY: ${{ secrets.VERIS_API_KEY }}
        run: |
          IMAGE_TAG="nightly-$(date -u +%Y%m%d)"
          echo "IMAGE_TAG=$IMAGE_TAG" >> "$GITHUB_ENV"
          veris login "$VERIS_API_KEY"
          veris env push --env-id ${{ env.ENV_ID }} --tag "$IMAGE_TAG"

      - name: Run simulation & evaluation
        run: |
          veris run \
            --scenario-set-id ${{ env.SCENARIO_SET_ID }} \
            --env-id ${{ env.ENV_ID }} \
            --grader-id ${{ env.GRADER_ID }} \
            --image-tag "$IMAGE_TAG"

Results appear on the Benchmarks page in the console, where you can track trends over time.

Nightly image tags must use the format nightly-YYYYMMDD (e.g. nightly-20260331) for the console to parse them into the trend chart.

Output

The markdown summary written to stdout contains:

  • Run metadata. Run ID, eval run ID, status, scenario set, duration.
  • Results by category. Aggregated pass/fail scores across all simulations.
  • Scenario-level success criteria. Assertion pass rate.
  • Console link. Direct link to the evaluation run.

It’s structured for pasting directly into a PR comment. For the full set of veris run flags, see CLI commands.
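If you don't want the sticky-comment action, one alternative sketch is to post the file with the GitHub CLI, assuming gh is available on the runner, GH_TOKEN is set, and PR_NUMBER holds the pull request number:

# Post the report as a regular PR comment
gh pr comment "$PR_NUMBER" --body-file veris-summary.md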