On-Prem / Customer-Hosted Clusters

Run simulations, scenario generation, and reports on your own Kubernetes cluster (EKS or GKE). Your agent code stays on your infrastructure — Veris manages orchestration and collects results.

How It Works

Figure: Customer-hosted architecture

What runs on your cluster: Simulation jobs, scenario generation, and report generation (your agent + Veris sandbox container).

What stays on Veris: Evaluation/grading, artifact storage (GCS), and the API.

Prerequisites

A Kubernetes cluster (EKS or GKE)
kubectl access to the cluster
A container registry accessible from your cluster (ECR, Artifact Registry, or any private registry)
A Veris organization with an API key

Infrastructure Requirements

Node Sizing

Resource requirements differ by job type. Your cluster must have nodes large enough to schedule the largest job type you plan to run.

Job Type	CPU Request	Memory Request	Memory Limit	Minimum Node Size
Scenario generation	500m	4 Gi	16 Gi	8 Gi+ RAM (e.g., `t3.xlarge`)
Report generation	500m	4 Gi	16 Gi	8 Gi+ RAM (e.g., `t3.xlarge`)
Simulation	1000m	2 Gi	4 Gi	4 Gi+ RAM (e.g., `t3.large`)

Scenario generation and report generation are memory-intensive. They cannot be scheduled on nodes with less than 8 Gi of allocatable memory (e.g., t3.medium, which has ~3.3 Gi allocatable).

Recommended: Use t3.xlarge (16 Gi) or equivalent nodes. This comfortably runs all job types and allows multiple concurrent simulations.
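To check whether existing nodes meet these thresholds, a minimal sketch along the following lines may help. The helper names to_mib and check_nodes are illustrative and not part of any Veris tooling. The conversion only handles the Ki, Mi, and Gi suffixes plus plain byte counts.

```shell
# Convert a Kubernetes memory quantity (Ki, Mi, Gi suffix, or plain bytes)
# to whole MiB, rounding down.
to_mib() {
  local q="$1"
  case "$q" in
    *Ki) echo $(( ${q%Ki} / 1024 )) ;;
    *Mi) echo "${q%Mi}" ;;
    *Gi) echo $(( ${q%Gi} * 1024 )) ;;
    *)   echo $(( $q / 1024 / 1024 )) ;;
  esac
}

# Read "<node-name> <quantity>" lines on stdin and print OK or TOO_SMALL
# for each node, compared against a minimum given in MiB.
check_nodes() {
  local min_mib="$1" name qty mib
  while read -r name qty; do
    [ -z "$name" ] && continue
    mib=$(to_mib "$qty")
    if [ "$mib" -ge "$min_mib" ]; then
      echo "$name OK ${mib}Mi"
    else
      echo "$name TOO_SMALL ${mib}Mi"
    fi
  done
}

# Usage against a live cluster (8192 MiB corresponds to the 8 Gi figure above):
# kubectl get nodes \
#   -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.allocatable.memory}{"\n"}{end}' \
#   | check_nodes 8192
```

Nodes reported as TOO_SMALL against 8192 MiB would not be expected to accept scenario generation or report generation jobs.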

Network / Firewall Rules

Inbound: Your K8s API server must be reachable from Veris on its API port (typically 443). The Veris backend connects to your API server to create and manage jobs.

Outbound: Your cluster nodes need HTTPS (port 443) to the following destinations:

Destination	Purpose	Required
`storage.googleapis.com`	Download scenario data, upload logs and results	Yes
`api.openai.com`	LLM calls (agent + Veris-internal seed/brief generation)	Yes
`api.anthropic.com`	LLM calls (agent + Veris-internal seed/brief generation)	Yes
`*.openai.azure.com`	Azure LLM fallback (if configured)	If using Azure
`api.veris.ai`	Event reporting, backend callbacks	Recommended
Your container registry (e.g., ECR, Artifact Registry)	Pull simulation images	Yes

LLM provider access is required even if your agent doesn’t directly call these APIs. Veris sandbox services use LLM calls internally for seed generation, brief creation, and scenario exploration.
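One way to confirm the outbound rules is to probe each fixed hostname from inside the cluster. The sketch below assumes curl is available wherever it runs (a node shell or a pod). The function names are hypothetical. The wildcard Azure hostname and your own registry are left out because they are deployment-specific.

```shell
# Hosts from the table above that can be probed directly.
required_hosts() {
  printf '%s\n' \
    storage.googleapis.com \
    api.openai.com \
    api.anthropic.com \
    api.veris.ai
}

# Succeeds if an HTTPS connection completes, whatever HTTP status comes back.
probe_host() {
  curl -sS -o /dev/null --connect-timeout 5 --max-time 10 "https://$1"
}

# Probe every host and print one result line per host.
probe_all() {
  local host
  for host in $(required_hosts); do
    if probe_host "$host" 2>/dev/null; then
      echo "$host reachable"
    else
      echo "$host UNREACHABLE"
    fi
  done
}

# Usage, from a shell inside the cluster network:
# probe_all
```

A 4xx response still counts as reachable here, since the goal is only to show that port 443 is open to the destination.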

Container Registry

Your agent image must be accessible from your cluster’s nodes. Common setups:

ECR (AWS)

If your EKS cluster and ECR repo are in the same AWS account, image pull works automatically via the node IAM role. No additional configuration needed.

For cross-account ECR, configure imagePullSecrets on the veris namespace’s default ServiceAccount.

Other registries

For private registries, create an imagePullSecret in the veris namespace and attach it to the default ServiceAccount:


kubectl create secret docker-registry regcred \
  --docker-server=YOUR_REGISTRY \
  --docker-username=YOUR_USER \
  --docker-password=YOUR_PASSWORD \
  -n veris
 
kubectl patch serviceaccount default -n veris \
  -p '{"imagePullSecrets": [{"name": "regcred"}]}'

Cluster Setup

Automated (Terraform)

We provide Terraform modules that create the namespace, RBAC, token, and container registry in one command. Download the main.tf for your provider, configure your variables, and apply.

EKS

Download eks-main.tf and save it as main.tf, then:


# Configure
cat > terraform.tfvars <<EOF
cluster_name          = "my-eks-cluster"
veris_organization_id = "org_your_org_id"
region                = "us-east-1"
EOF
 
# Apply
terraform init && terraform apply
 
# Register with Veris
terraform output -raw veris_cluster_registration | \
  curl -X POST https://api.veris.ai/v1/clusters \
    -H "Authorization: Bearer $VERIS_API_KEY" \
    -H "Content-Type: application/json" \
    -d @-

Manual

Create the Veris namespace


kubectl create namespace veris

Create a ServiceAccount with RBAC

The Veris backend needs permissions to create and manage jobs in your cluster.


kubectl create serviceaccount veris-sa -n veris
 
kubectl apply -f - <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: veris-job-manager
  namespace: veris
rules:
- apiGroups: ["batch"]
  resources: ["jobs"]
  verbs: ["create", "get", "list", "delete", "watch"]
- apiGroups: [""]
  resources: ["pods", "pods/log"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["secrets", "configmaps"]
  verbs: ["create", "get", "delete", "list"]
- apiGroups: [""]
  resources: ["namespaces"]
  verbs: ["get"]
EOF
 
kubectl create rolebinding veris-sa-binding \
  --role=veris-job-manager \
  --serviceaccount=veris:veris-sa \
  -n veris
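After creating the binding, the permissions can be spot-checked with kubectl auth can-i. This is a sketch under two assumptions: your own kubeconfig user is allowed to impersonate service accounts (cluster admins normally are), and the function names are hypothetical. It covers jobs, pods, secrets, and configmaps only. The pods/log subresource and the namespaces rule would need separate checks.

```shell
# Print the "verb resource" pairs granted by the veris-job-manager Role above
# (pods/log and namespaces are intentionally not listed).
expected_permissions() {
  local v
  for v in create get list delete watch; do echo "$v jobs"; done
  for v in get list watch; do echo "$v pods"; done
  for v in create get delete list; do
    echo "$v secrets"
    echo "$v configmaps"
  done
}

# Ask the API server whether veris-sa holds each permission in the veris namespace.
verify_rbac() {
  local sa="system:serviceaccount:veris:veris-sa" verb resource
  expected_permissions | while read -r verb resource; do
    printf '%-8s %-12s ' "$verb" "$resource"
    kubectl auth can-i "$verb" "$resource" --as="$sa" -n veris
  done
}

# Usage:
# verify_rbac
```

Every line should end in yes. A no usually points to a typo in the Role or a RoleBinding created in the wrong namespace.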

Generate an authentication token

Long-lived token (recommended)

Create a Secret of type kubernetes.io/service-account-token. Kubernetes populates it with a token for veris-sa that does not expire:


kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: veris-sa-token
  namespace: veris
  annotations:
    kubernetes.io/service-account.name: veris-sa
type: kubernetes.io/service-account-token
EOF
 
# Retrieve the token
kubectl get secret veris-sa-token -n veris \
  -o jsonpath='{.data.token}' | base64 -d
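Before registering the token, it can be worth confirming that it was issued for veris-sa. Service account tokens are JWTs, so the payload can be decoded locally. The helper below is a hypothetical sketch. It only decodes the payload and does not verify the signature.

```shell
# Print the decoded JSON payload (second segment) of a JWT.
jwt_payload() {
  local seg
  # Take the second dot-separated segment and map base64url to plain base64.
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the padding that JWT encoding strips.
  while [ $(( ${#seg} % 4 )) -ne 0 ]; do seg="${seg}="; done
  printf '%s' "$seg" | base64 -d
}

# Usage:
# TOKEN=$(kubectl get secret veris-sa-token -n veris -o jsonpath='{.data.token}' | base64 -d)
# jwt_payload "$TOKEN"
```

The sub claim in the output is expected to read system:serviceaccount:veris:veris-sa.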

Get your cluster endpoint and CA certificate

EKS


aws eks describe-cluster --name YOUR_CLUSTER_NAME --region YOUR_REGION \
  --query 'cluster.{endpoint: endpoint, ca: certificateAuthority.data}' \
  --output json

Decode the CA certificate:


echo "BASE64_CA_DATA" | base64 -d

Register the cluster with Veris

Include your external image registry URL so Veris knows where your agent images are stored.


curl -X POST https://api.veris.ai/v1/clusters \
  -H "Authorization: Bearer YOUR_VERIS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "organization_id": "YOUR_ORG_ID",
    "name": "my-production-cluster",
    "provider": "eks",
    "api_server_url": "https://YOUR_CLUSTER_ENDPOINT",
    "ca_certificate": "-----BEGIN CERTIFICATE-----\n...\n-----END CERTIFICATE-----",
    "namespace": "veris",
    "auth_type": "token",
    "credentials": {
      "token": "YOUR_SA_TOKEN"
    },
    "image_registry_url": "YOUR_ACCOUNT_ID.dkr.ecr.YOUR_REGION.amazonaws.com/YOUR_REPO"
  }'

The image_registry_url tells Veris where your agent images live. For ECR, this is the repository URI (e.g., 011984682167.dkr.ecr.us-east-1.amazonaws.com/simulation-images). For Artifact Registry, use the full repository path (e.g., us-central1-docker.pkg.dev/my-project/my-repo).
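The ca_certificate field in the request above is a single JSON string with \n escapes, while a decoded PEM file contains real line breaks. One way to convert between the two is sketched below. The function name and the ca.crt filename are assumptions for illustration.

```shell
# Read a PEM certificate on stdin and print it on one line, replacing each
# line break with the two-character sequence \n so it fits in a JSON string.
pem_to_json_string() {
  tr -d '\r' | awk '{ printf "%s\\n", $0 }'
}

# Usage (ca.crt is an assumed filename for the decoded certificate):
# CA_JSON=$(pem_to_json_string < ca.crt)
# Then use "$CA_JSON" as the ca_certificate value in the request body.
```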

Test connectivity


curl -X POST https://api.veris.ai/v1/clusters/CLUSTER_ID/test \
  -H "Authorization: Bearer YOUR_VERIS_API_KEY"

You should see:


{
  "connected": true,
  "message": "Connected to eks cluster, namespace 'veris' exists",
  "server_version": "1.34"
}
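In a setup script, the response can be checked automatically instead of by eye. The sketch below uses grep rather than a JSON parser, so it assumes the response keeps the shape shown above. is_connected is a hypothetical helper name.

```shell
# Exit 0 if the JSON on stdin contains "connected": true, otherwise exit 1.
is_connected() {
  tr -d ' \n\t' | grep -q '"connected":true'
}

# Usage:
# curl -sS -X POST https://api.veris.ai/v1/clusters/CLUSTER_ID/test \
#   -H "Authorization: Bearer YOUR_VERIS_API_KEY" \
#   | is_connected || { echo "cluster connectivity test failed" >&2; exit 1; }
```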

Environment Setup

Once your cluster is registered, set up an environment and push your agent image.

Create an environment


veris env create --name my-agent

Note the environment ID (e.g., env_abc123) from the output — you’ll need it for the next step.

Create a container repository for your environment

Each environment gets its own repository path under your image registry.

ECR

ECR requires repositories to be created explicitly:


aws ecr create-repository \
  --repository-name YOUR_REPO/ENV_ID \
  --region YOUR_REGION

For example, if the repository path at the end of your image_registry_url is simulation-images and your environment ID is env_abc123:


aws ecr create-repository \
  --repository-name simulation-images/env_abc123 \
  --region us-east-1
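When creating repositories for several environments, the repository name and region can be derived from the registered image_registry_url rather than typed by hand. This is a sketch with hypothetical helper names. It assumes the URL has the ECR form ACCOUNT.dkr.ecr.REGION.amazonaws.com/PREFIX.

```shell
# Repository path portion of a registry URL (everything after the first "/").
registry_prefix() {
  printf '%s\n' "${1#*/}"
}

# AWS region embedded in an ECR hostname: <account>.dkr.ecr.<region>.amazonaws.com
ecr_region() {
  printf '%s\n' "$1" | cut -d. -f4
}

# Full ECR repository name for one environment: <prefix>/<env-id>
env_repository() {
  printf '%s/%s\n' "$(registry_prefix "$1")" "$2"
}

# Usage:
# REGISTRY=011984682167.dkr.ecr.us-east-1.amazonaws.com/simulation-images
# aws ecr create-repository \
#   --repository-name "$(env_repository "$REGISTRY" env_abc123)" \
#   --region "$(ecr_region "$REGISTRY")"
```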

Push your agent image

The Veris CLI detects that your organization has an external registry and automatically builds the image locally:


veris env push --tag latest

This will:

Pull the Veris base image using credentials from the API
Build your agent image locally using .veris/Dockerfile.sandbox
Push to your external registry at YOUR_REGISTRY/ENV_ID:latest

ECR users: Make sure you’re authenticated to ECR before pushing:


aws ecr get-login-password --region YOUR_REGION | \
  docker login --username AWS --password-stdin YOUR_ACCOUNT_ID.dkr.ecr.YOUR_REGION.amazonaws.com

Managing Your Cluster

Update credentials

When your token expires or you rotate credentials:


curl -X PUT https://api.veris.ai/v1/clusters/CLUSTER_ID \
  -H "Authorization: Bearer YOUR_VERIS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"credentials": {"token": "NEW_TOKEN"}}'

The cluster status resets to pending after credential updates. Run the connectivity test again to verify.

Remove a cluster


curl -X DELETE https://api.veris.ai/v1/clusters/CLUSTER_ID \
  -H "Authorization: Bearer YOUR_VERIS_API_KEY"

How Jobs Run on Your Cluster

When you trigger a simulation, scenario generation, or report generation, the Veris backend:

Creates a per-run K8s Secret on your cluster containing short-lived GCS credentials and Veris-internal LLM keys. Both are scoped to that single run.
Creates a K8s Job using your registered image. The job manifest is patched to remove Veris-specific scheduling (gVisor, node selectors, tolerations) so it runs on any available node.
An init container downloads scenario data and configuration from Veris GCS using the injected credentials.
Your agent runs inside the simulation container alongside Veris sandbox services (mock services, LLM proxy, engine).
A sidecar container periodically uploads logs, results, and generated artifacts back to Veris GCS.
On completion, the per-run Secret is automatically deleted from your cluster.

Limitations

Evaluation and grading run on Veris-managed infrastructure (not your cluster).
One cluster per organization. Each Veris organization maps to one registered cluster.
Supported providers. Currently EKS and GKE. Other K8s distributions may work with bearer token auth but are not officially supported.
GCS-based artifacts. All artifacts (scenarios, logs, results) are stored in Veris-managed GCS buckets.
GCS token lifetime. The short-lived GCS credential is valid for 1 hour. Jobs that exceed this window may fail to upload final results.
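To spot runs at risk from the token lifetime, job durations can be compared against the 1 hour window. The helper below is a hypothetical sketch that works on epoch seconds. Converting a Job's status.startTime and status.completionTime timestamps to epoch seconds is left to the caller, for example with GNU date.

```shell
GCS_TOKEN_LIFETIME_SECONDS=3600   # the 1 hour lifetime stated above

# Exit 0 if a job that ran from START to END (epoch seconds) outlived the token.
exceeds_token_window() {
  [ $(( $2 - $1 )) -gt "$GCS_TOKEN_LIFETIME_SECONDS" ]
}

# Usage (GNU date assumed for timestamp parsing):
# start=$(date -u -d "2025-01-01T10:00:00Z" +%s)
# end=$(date -u -d "2025-01-01T11:15:00Z" +%s)
# exceeds_token_window "$start" "$end" && echo "job ran longer than the GCS token lifetime"
```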