On-Prem / Customer-Hosted Clusters

Run simulations, scenario generation, and reports on your own Kubernetes cluster (EKS or GKE). Your agent code stays on your infrastructure — Veris manages orchestration and collects results.

How It Works

[Figure: Customer-hosted architecture]

What runs on your cluster: Simulation jobs, scenario generation, and report generation (your agent + Veris sandbox container).

What stays on Veris: Evaluation/grading, artifact storage (GCS), and the API.

Prerequisites

  • A Kubernetes cluster (EKS or GKE)
  • kubectl access to the cluster
  • A container registry accessible from your cluster (ECR, Artifact Registry, or any private registry)
  • A Veris organization with an API key

Infrastructure Requirements

Node Sizing

Veris jobs have different resource requirements. Your cluster must have nodes large enough to schedule them.

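| Job Type            | CPU Request | Memory Request | Memory Limit | Minimum Node Size            |
|---------------------|-------------|----------------|--------------|------------------------------|
| Scenario generation | 500m        | 4 Gi           | 16 Gi        | 8 Gi+ RAM (e.g., t3.xlarge)  |
| Report generation   | 500m        | 4 Gi           | 16 Gi        | 8 Gi+ RAM (e.g., t3.xlarge)  |
| Simulation          | 1000m       | 2 Gi           | 4 Gi         | 4 Gi+ RAM (e.g., t3.large)   |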

Scenario generation and report generation are memory-intensive. Nodes with less than 8 Gi of allocatable memory (e.g., t3.medium with ~3.3 Gi allocatable) will fail to schedule these jobs.

Recommended: Use t3.xlarge (16 Gi) or equivalent nodes. This comfortably runs all job types and allows multiple concurrent simulations.
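To check whether your current nodes meet the 8 Gi allocatable-memory threshold, you can compare each node's reported allocatable memory against it. The following is a minimal sketch, not a Veris tool: the helper names are hypothetical, and it assumes kubectl reports allocatable memory with a Ki suffix, which is common but not guaranteed.

```shell
# Hypothetical helpers; not part of the Veris CLI.
# Assumption: allocatable memory is reported as "<n>Ki" (e.g. "16073060Ki").

# Convert a "<n>Ki" quantity to whole MiB.
ki_to_mib() {
  ki="${1%Ki}"
  echo $(( $ki / 1024 ))
}

# Report whether a node clears the 8 Gi (8192 MiB) allocatable threshold
# needed for scenario generation and report generation.
node_fits_generation() {
  case "$1" in
    *Ki) ;;
    *) echo "unknown-unit"; return 0 ;;
  esac
  if [ "$(ki_to_mib "$1")" -ge 8192 ]; then
    echo "ok"
  else
    echo "too-small"
  fi
}

# Print one line per node: name, allocatable memory, verdict.
check_nodes() {
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.allocatable.memory}{"\n"}{end}' |
  while read -r name mem; do
    echo "$name $mem $(node_fits_generation "$mem")"
  done
}
```

Running check_nodes against a cluster prints one verdict per node. Nodes marked too-small can still run simulations if they have enough free memory for the 2 Gi request.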

Network / Firewall Rules

Inbound: Your K8s API server must be reachable from Veris on its API port (typically 443). The Veris backend connects to your API server to create and manage jobs.

Outbound: Your cluster nodes need HTTPS (port 443) to the following destinations:

| Destination                                            | Purpose                                                  | Required       |
|--------------------------------------------------------|----------------------------------------------------------|----------------|
| storage.googleapis.com                                 | Download scenario data, upload logs and results          | Yes            |
| api.openai.com                                         | LLM calls (agent + Veris-internal seed/brief generation) | Yes            |
| api.anthropic.com                                      | LLM calls (agent + Veris-internal seed/brief generation) | Yes            |
| *.openai.azure.com                                     | Azure LLM fallback (if configured)                       | If using Azure |
| api.veris.ai                                           | Event reporting, backend callbacks                       | Recommended    |
| Your container registry (e.g., ECR, Artifact Registry) | Pull simulation images                                   | Yes            |

LLM provider access is required even if your agent doesn’t directly call these APIs. Veris sandbox services use LLM calls internally for seed generation, brief creation, and scenario exploration.
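One way to verify the outbound rules is to attempt an HTTPS connection to each destination from inside the cluster network. The sketch below is a hedged example: check_egress is a hypothetical helper, and it treats any HTTP response (including 401 or 404) as proof that the connection is allowed, since only the network path is being tested.

```shell
# Hypothetical helper: report whether each host accepts an HTTPS request.
# curl exits 0 for any HTTP response when --fail is not used, so only
# connection-level failures (DNS, timeout, TLS, blocked egress) are
# reported as "blocked".
check_egress() {
  for host in "$@"; do
    if curl -sS -o /dev/null --connect-timeout 5 --max-time 10 \
        "https://$host/" 2>/dev/null; then
      echo "ok $host"
    else
      echo "blocked $host"
    fi
  done
}
```

For example, check_egress storage.googleapis.com api.openai.com api.anthropic.com api.veris.ai prints one line per host. Run it from a pod or node in the cluster rather than from a workstation, because the workstation's egress path usually differs from the nodes'.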

Container Registry

Your agent image must be accessible from your cluster’s nodes. Common setups:

If your EKS cluster and ECR repo are in the same AWS account, image pull works automatically via the node IAM role. No additional configuration needed.

For cross-account ECR, configure imagePullSecrets on the veris namespace’s default ServiceAccount.
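A cross-account setup could look roughly like the sketch below. It assumes kubectl and the AWS CLI are configured, that the veris namespace already exists, and that the ECR repository policy grants pull access to the credentials in use. The function names and the default secret name ecr-pull are placeholders, not Veris conventions.

```shell
# Build the ServiceAccount patch that references a pull secret by name.
pull_secret_patch() {
  printf '{"imagePullSecrets":[{"name":"%s"}]}' "$1"
}

# Create a pull secret from a fresh ECR token and attach it to the
# default ServiceAccount in the veris namespace.
# Args: <registry-account-id> <region> [secret-name]
configure_ecr_pull_secret() {
  account="$1"
  region="$2"
  secret="${3:-ecr-pull}"   # placeholder name
  kubectl create secret docker-registry "$secret" \
    --namespace veris \
    --docker-server="$account.dkr.ecr.$region.amazonaws.com" \
    --docker-username=AWS \
    --docker-password="$(aws ecr get-login-password --region "$region")"
  kubectl patch serviceaccount default --namespace veris \
    -p "$(pull_secret_patch "$secret")"
}
```

ECR authorization tokens expire after 12 hours, so a secret created this way has to be refreshed on a schedule. Granting the node IAM role cross-account pull access in the repository policy avoids that refresh.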

Cluster Setup

We provide Terraform modules that create the namespace, RBAC, token, and container registry in one command. Download the main.tf for your provider, configure your variables, and apply.

Download eks-main.tf and save it as main.tf, then:

# Configure
cat > terraform.tfvars <<EOF
cluster_name = "my-eks-cluster"
veris_organization_id = "org_your_org_id"
region = "us-east-1"
EOF

# Apply
terraform init && terraform apply

# Register with Veris
terraform output -raw veris_cluster_registration | \
  curl -X POST https://api.veris.ai/v1/clusters \
    -H "Authorization: Bearer $VERIS_API_KEY" \
    -H "Content-Type: application/json" \
    -d @-

Environment Setup

Once your cluster is registered, set up an environment and push your agent image.

Create an environment

veris env create --name my-agent

Note the environment ID (e.g., env_abc123) from the output — you’ll need it for the next step.

Create a container repository for your environment

Each environment gets its own repository path under your image registry.

ECR requires repositories to be created explicitly:

aws ecr create-repository \
  --repository-name YOUR_REPO/ENV_ID \
  --region YOUR_REGION

For example, if your registry is simulation-images and your environment ID is env_abc123:

aws ecr create-repository \
  --repository-name simulation-images/env_abc123 \
  --region us-east-1
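If you create environments from a script, it can help to make the repository step safe to re-run. The following sketch composes the per-environment repository name and creates it only when a lookup fails. The helper names are hypothetical, and note that any describe-repositories failure (including a permissions error) is treated as "missing".

```shell
# Compose the per-environment repository name: <registry-repo>/<env-id>.
env_repo_name() {
  echo "$1/$2"
}

# Create the repository only if it does not already exist.
# Args: <registry-repo> <env-id> <region>
ensure_env_repo() {
  name=$(env_repo_name "$1" "$2")
  if aws ecr describe-repositories \
      --repository-names "$name" --region "$3" >/dev/null 2>&1; then
    echo "exists $name"
  else
    aws ecr create-repository \
      --repository-name "$name" --region "$3" >/dev/null &&
      echo "created $name"
  fi
}
```

For example, ensure_env_repo simulation-images env_abc123 us-east-1 corresponds to the command above.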

Push your agent image

The Veris CLI detects that your organization uses an external registry and builds the image locally:

veris env push --tag latest

This will:

  1. Pull the Veris base image using credentials from the API
  2. Build your agent image locally using .veris/Dockerfile.sandbox
  3. Push to your external registry at YOUR_REGISTRY/ENV_ID:latest

ECR users: Make sure you’re authenticated to ECR before pushing:

aws ecr get-login-password --region YOUR_REGION | \
  docker login --username AWS --password-stdin YOUR_ACCOUNT_ID.dkr.ecr.YOUR_REGION.amazonaws.com
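After a push, you may want to confirm that the tag is present in ECR before triggering a run. This is a minimal sketch that assumes the AWS CLI is configured; the helper names are hypothetical.

```shell
# Registry hostname for an AWS account and region.
# Args: <account-id> <region>
ecr_registry_host() {
  echo "$1.dkr.ecr.$2.amazonaws.com"
}

# Succeeds (exit 0) if <repo-name>:<tag> exists in ECR in <region>.
# Args: <repo-name> <tag> <region>
image_pushed() {
  aws ecr describe-images \
    --repository-name "$1" \
    --image-ids "imageTag=$2" \
    --region "$3" >/dev/null 2>&1
}
```

For example, image_pushed simulation-images/env_abc123 latest us-east-1 && echo "image found" prints a message only when the tag exists.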

Managing Your Cluster

Update credentials

When your token expires or you rotate credentials:

curl -X PUT https://api.veris.ai/v1/clusters/CLUSTER_ID \
  -H "Authorization: Bearer YOUR_VERIS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"credentials": {"token": "NEW_TOKEN"}}'

The cluster status resets to pending after credential updates. Run the connectivity test again to verify.
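When rotating tokens from a script, one option is to build the request body separately and pipe it to curl, which keeps the token out of the process argument list. The sketch below wraps the same PUT request shown above. The function names are hypothetical, and the payload builder assumes the token needs no JSON escaping (Kubernetes ServiceAccount tokens are JWTs made of base64url characters and dots).

```shell
# Build the request body for a credential update.
# Assumption: the token contains no quotes or backslashes.
credentials_payload() {
  printf '{"credentials": {"token": "%s"}}' "$1"
}

# Send the update. Reads VERIS_API_KEY from the environment.
# Args: <cluster-id> <new-token>
update_cluster_token() {
  credentials_payload "$2" |
    curl -sS -X PUT "https://api.veris.ai/v1/clusters/$1" \
      -H "Authorization: Bearer $VERIS_API_KEY" \
      -H "Content-Type: application/json" \
      -d @-
}
```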

Remove a cluster

curl -X DELETE https://api.veris.ai/v1/clusters/CLUSTER_ID \
  -H "Authorization: Bearer YOUR_VERIS_API_KEY"

How Jobs Run on Your Cluster

When you trigger a simulation, scenario generation, or report, the run proceeds as follows:

  1. The Veris backend creates a per-run K8s Secret on your cluster containing short-lived GCS credentials and Veris-internal LLM keys. The Secret is scoped to that single run.
  2. The backend creates a K8s Job using your registered image. The job manifest is patched to remove Veris-specific scheduling (gVisor, node selectors, tolerations) so it runs on any available node.
  3. An init container downloads scenario data and configuration from Veris GCS using the injected credentials.
  4. Your agent runs inside the simulation container alongside Veris sandbox services (mock services, LLM proxy, engine).
  5. A sidecar container periodically uploads logs, results, and generated artifacts back to Veris GCS.
  6. On completion, the per-run Secret is automatically deleted from your cluster.
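To watch these steps on your own cluster, you can list the objects Veris creates and inspect a pod's containers. This sketch assumes the jobs run in the veris namespace mentioned earlier on this page; the helper names are hypothetical.

```shell
# List Jobs, Pods, and Secrets in the veris namespace.
# Assumption: Veris creates its per-run objects in this namespace.
veris_run_overview() {
  kubectl get jobs,pods,secrets --namespace veris
}

# Print a pod's init container names on the first line and its
# regular (simulation and sidecar) container names on the second.
# Args: <pod-name>
pod_containers() {
  kubectl get pod "$1" --namespace veris \
    -o jsonpath='{.spec.initContainers[*].name}{"\n"}{.spec.containers[*].name}{"\n"}'
}
```

While a run is active, veris_run_overview should show the per-run Secret next to its Job. After completion the Secret should no longer be listed.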

Limitations

  • Evaluation and grading run on Veris-managed infrastructure (not your cluster).
  • One cluster per organization. Each Veris organization maps to one registered cluster.
  • Supported providers. Currently EKS and GKE. Other K8s distributions may work with bearer token auth but are not officially supported.
  • GCS-based artifacts. All artifacts (scenarios, logs, results) are stored in Veris-managed GCS buckets.
  • GCS token lifetime. The short-lived GCS credential is valid for 1 hour. Jobs that exceed this window may fail to upload final results.
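To spot runs at risk of hitting the token limit, you can compare a job's elapsed time against the one-hour window. The sketch below is illustrative only: it assumes the credential is issued at roughly the time the job starts, which this page does not state, and the helper names are hypothetical.

```shell
GCS_TOKEN_TTL_SECONDS=3600   # 1 hour, per the limitation above

# Succeeds (exit 0) if the elapsed time is longer than the token lifetime.
# Args: <start-epoch-seconds> <now-epoch-seconds>
exceeds_token_window() {
  [ $(( $2 - $1 )) -gt "$GCS_TOKEN_TTL_SECONDS" ]
}

# Seconds left before the credential expires (negative once expired).
# Args: <start-epoch-seconds> <now-epoch-seconds>
token_seconds_left() {
  echo $(( $GCS_TOKEN_TTL_SECONDS - ($2 - $1) ))
}

# Start time of a Job as epoch seconds. Requires GNU date (date -d).
# Assumption: jobs run in the veris namespace.
# Args: <job-name>
job_start_epoch() {
  date -d "$(kubectl get job "$1" --namespace veris \
    -o jsonpath='{.status.startTime}')" +%s
}
```

For example, token_seconds_left "$(job_start_epoch JOB_NAME)" "$(date +%s)" prints the approximate time remaining for a running job.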