Kubernetes Deployment

Deploy Aurora on any Kubernetes cluster using Helm.

Add the Helm Chart Source

Aurora's Helm chart is published to two registries. Use whichever your cluster tooling supports:

Option 1: Helm repository

helm repo add aurora https://raw.githubusercontent.com/Arvo-AI/aurora/gh-pages
helm repo update
helm search repo aurora

Option 2: OCI registry (GHCR)

helm show values oci://ghcr.io/arvo-ai/charts/aurora-oss > my-values.yaml

Both methods deliver the same chart. Choose OCI if your GitOps tooling (ArgoCD, Flux) already uses OCI, or the traditional repo if you prefer helm repo add.

Prerequisites

You need a Kubernetes cluster with:

  • 4+ CPU cores and 12+ GB RAM allocatable
  • A working default StorageClass (GKE and AKS have this out of the box; EKS needs setup)
  • kubectl connected to the cluster
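
A quick way to verify these requirements is to list allocatable node resources and confirm one StorageClass is marked as default:

# Allocatable CPU/memory per node
kubectl get nodes -o custom-columns='NAME:.metadata.name,CPU:.status.allocatable.cpu,MEM:.status.allocatable.memory'

# Look for "(default)" next to one StorageClass
kubectl get storageclass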

Don't have a cluster yet? Follow one of these setup guides first:

  • AWS EKS: EKS Cluster Setup for Aurora — includes CSI driver and S3 bucket creation
  • GCP GKE / Azure AKS: Create a cluster with default settings — both include working storage out of the box

Required tools

Tool      Install
kubectl   kubernetes.io/docs/tasks/tools
helm      helm.sh/docs/intro/install
yq        github.com/mikefarah/yq#install
openssl   Usually pre-installed. macOS: brew install openssl
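
A quick way to confirm everything is on your PATH before starting:

# Report any missing tool
for tool in kubectl helm yq openssl; do
  command -v "$tool" >/dev/null || echo "missing: $tool"
done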

Clone the repo

git clone https://github.com/arvo-ai/aurora.git
cd aurora

Step 1: Preflight Check

Verify your cluster is ready:

./deploy/preflight.sh

This checks: kubectl connection, required tools, node resources, StorageClass, CSI driver health (EKS), and ingress controller. Fix any FAIL items before continuing.

Step 2: Prepare S3 Storage & LLM API Key

The deploy script will ask for these. Have them ready.

S3-compatible storage

Aurora stores files in S3-compatible storage. If you followed the EKS setup guide, you already have this.

Provider           Endpoint URL                                    Notes
AWS S3             https://s3.amazonaws.com                        EKS guide covers bucket creation
Cloudflare R2      https://<ACCOUNT_ID>.r2.cloudflarestorage.com   Region: auto
GCS (S3 interop)   https://storage.googleapis.com                  Create HMAC keys
MinIO              http://minio:9000                               Self-hosted
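
If you still need an AWS bucket, a minimal sketch with the AWS CLI (my-aurora-bucket is a placeholder; regions other than us-east-1 also require --create-bucket-configuration LocationConstraint=<region>):

aws s3api create-bucket --bucket my-aurora-bucket --region us-east-1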

LLM API key

Aurora needs an LLM provider for its AI agents. Pick one:

Provider                   Get a key                      What to enter when prompted
OpenRouter (recommended)   openrouter.ai/keys             Provider: openrouter, Key: sk-or-v1-...
OpenAI                     platform.openai.com/api-keys   Provider: openai, Key: sk-...
Anthropic                  console.anthropic.com          Provider: anthropic, Key: sk-ant-...
Google                     aistudio.google.com/apikey     Provider: google, Key: AI...

OpenRouter is recommended — one key gives access to all models (GPT-4o, Claude, Gemini, etc.).
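
To sanity-check a key before deploying, you can send a minimal request. This sketch follows OpenRouter's OpenAI-compatible API (openrouter/auto is their model-routing alias; substitute your own key). An invalid key returns an authentication error:

curl -s https://openrouter.ai/api/v1/chat/completions \
-H "Authorization: Bearer sk-or-v1-..." \
-H "Content-Type: application/json" \
-d '{"model": "openrouter/auto", "messages": [{"role": "user", "content": "ping"}]}'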

Step 3: Deploy Aurora

Option A: Deploy script (recommended)

# Use prebuilt images from GHCR (no Docker build needed — fastest option)
./deploy/k8s-deploy.sh --skip-build

The script will prompt you for several values. Here's what to enter:

Prompt                    What to enter
Container registry        ghcr.io/arvo-ai
Bucket name               Your S3 bucket name from Step 2
Endpoint URL              https://s3.amazonaws.com (or your provider's endpoint)
Region                    us-east-1 (or your bucket's region)
Access key / Secret key   From Step 2 (leave blank when using IRSA/pod identity)
LLM Provider              openrouter (or whichever you chose)
API key                   Your key from Step 2
Environment               staging (or production)

The script will:

  1. Install an ingress controller if missing (default: nginx, but any controller works — see Ingress Controller below)
  2. Detect the ingress IP (resolves AWS ELB hostnames to IPs automatically)
  3. Generate values.generated.yaml with your config + auto-generated secrets
  4. Deploy with Helm
  5. Initialize Vault (init, unseal, KV engine, app token)

After deployment, the script prints the access URLs. Open the frontend URL in your browser.

Option B: Manual Helm deployment

For more control, deploy step by step.

# 1. Add the repo (skip if already added)
helm repo add aurora https://raw.githubusercontent.com/Arvo-AI/aurora/gh-pages
helm repo update

# 2. Pull the default values
helm show values aurora/aurora-oss > values.generated.yaml
# or via OCI: helm show values oci://ghcr.io/arvo-ai/charts/aurora-oss > values.generated.yaml

Edit values.generated.yaml — see Configuration Reference below for all options. At minimum, set:

image:
  registry: "ghcr.io/arvo-ai"  # or your own registry
  tag: "latest"

config:
  NEXT_PUBLIC_BACKEND_URL: "http://api.aurora-oss.<IP>.nip.io"
  NEXT_PUBLIC_WEBSOCKET_URL: "ws://ws.aurora-oss.<IP>.nip.io"
  FRONTEND_URL: "http://aurora-oss.<IP>.nip.io"
  STORAGE_BUCKET: "my-bucket"
  STORAGE_ENDPOINT_URL: "https://s3.amazonaws.com"
  STORAGE_REGION: "us-east-1"

secrets:
  db:
    POSTGRES_PASSWORD: ""  # openssl rand -base64 32
  backend:
    STORAGE_ACCESS_KEY: ""  # optional when using IRSA/pod identity
    STORAGE_SECRET_KEY: ""  # optional when using IRSA/pod identity
  app:
    FLASK_SECRET_KEY: ""  # openssl rand -base64 32
    AUTH_SECRET: ""  # openssl rand -base64 32
    SEARXNG_SECRET: ""  # openssl rand -base64 32
  llm:
    OPENROUTER_API_KEY: ""  # or OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.

ingress:
  hosts:
    frontend: "aurora-oss.<IP>.nip.io"
    api: "api.aurora-oss.<IP>.nip.io"
    ws: "ws.aurora-oss.<IP>.nip.io"
Guardrails require an LLM

AI safety guardrails ship on by default (GUARDRAILS_ENABLED: "true" in values.yaml). The LLM judge and input rail fail closed on any LLM error, so every shell command is blocked if the provider is unreachable. Make sure at least one LLM key in the llm secret group is valid before installing the chart, or set GUARDRAILS_ENABLED: "false" in your values overlay. See Command Safety.

# 3. Install an ingress controller if not already installed (see Ingress Controller section below)
# Example: nginx ingress (optional — use any controller that supports the Kubernetes Ingress API)
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.8.1/deploy/static/provider/cloud/deploy.yaml

# 4. Deploy from the published chart
helm upgrade --install aurora-oss aurora/aurora-oss \
--namespace aurora-oss --create-namespace --reset-values \
-f values.generated.yaml

# Or deploy from the OCI registry
helm upgrade --install aurora-oss oci://ghcr.io/arvo-ai/charts/aurora-oss \
--namespace aurora-oss --create-namespace --reset-values \
-f values.generated.yaml

# 5. Set up Vault — see Step 4 below

Using the chart from a local clone

If you prefer to install from a local checkout (e.g., for development or custom chart modifications):

# 1. Create values file
cp deploy/helm/aurora/values.yaml deploy/helm/aurora/values.generated.yaml

Edit values.generated.yaml with the same settings shown above, then:

# 2. Install an ingress controller if not already installed (see Ingress Controller section below)
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.8.1/deploy/static/provider/cloud/deploy.yaml

# 3. Deploy from local chart
helm upgrade --install aurora-oss ./deploy/helm/aurora \
--namespace aurora-oss --create-namespace --reset-values \
-f deploy/helm/aurora/values.generated.yaml

# 4. Set up Vault — see Step 4 below

Building your own images

If you need custom images instead of GHCR prebuilt ones:

# Build and push with make (reads registry from values.generated.yaml)
make deploy

# Or build manually
GIT_SHA=$(git rev-parse --short HEAD)
REGISTRY="your-registry.example.com"

docker buildx build ./server --target=prod --platform linux/amd64 --push \
-t $REGISTRY/aurora-server:$GIT_SHA

docker buildx build ./client --target=prod --platform linux/amd64 --push \
--build-arg NEXT_PUBLIC_BACKEND_URL=http://api.aurora-oss.<IP>.nip.io \
--build-arg NEXT_PUBLIC_WEBSOCKET_URL=ws://ws.aurora-oss.<IP>.nip.io \
-t $REGISTRY/aurora-frontend:$GIT_SHA

NEXT_PUBLIC_* variables are baked at build time

The frontend image must be rebuilt when NEXT_PUBLIC_BACKEND_URL or NEXT_PUBLIC_WEBSOCKET_URL change. Prebuilt GHCR images use runtime injection via env-config.js, so this only applies to custom builds.

Step 4: Vault Setup

See Vault Auto-Unseal with KMS for setup. This eliminates manual unsealing after pod restarts.

For testing: Manual Vault init

If the deploy script handled Vault automatically, you're done. If you need to do it manually, run each command one at a time (copy-pasting multiple heredoc commands at once breaks in zsh):

Credentials file

The deploy script writes vault-init-aurora-oss.txt to the repo root containing the Vault root token and unseal key. This file is gitignored but still exists on disk. Move the contents to a password manager or secure vault, then delete the file.

# Initialize (save the Unseal Key and Root Token!)
kubectl -n aurora-oss exec -it statefulset/aurora-oss-vault -- \
vault operator init -key-shares=1 -key-threshold=1

# Unseal
kubectl -n aurora-oss exec -it statefulset/aurora-oss-vault -- \
vault operator unseal <UNSEAL_KEY>

# Login
kubectl -n aurora-oss exec statefulset/aurora-oss-vault -- sh -c \
'export VAULT_ADDR=http://127.0.0.1:8200 && echo "<ROOT_TOKEN>" | vault login -'

# Enable KV engine
kubectl -n aurora-oss exec statefulset/aurora-oss-vault -- sh -c \
'export VAULT_ADDR=http://127.0.0.1:8200 && vault secrets enable -path=aurora kv-v2'

# Create policy
kubectl -n aurora-oss exec statefulset/aurora-oss-vault -- sh -c \
'export VAULT_ADDR=http://127.0.0.1:8200 && vault policy write aurora-app - <<EOF
path "aurora/data/users/*" { capabilities = ["create","read","update","delete","list"] }
path "aurora/metadata/users/*" { capabilities = ["list","read","delete"] }
path "aurora/metadata/" { capabilities = ["list"] }
path "aurora/metadata/users" { capabilities = ["list"] }
EOF'

# Create app token
kubectl -n aurora-oss exec statefulset/aurora-oss-vault -- sh -c \
'export VAULT_ADDR=http://127.0.0.1:8200 && vault token create -policy=aurora-app -ttl=0'

Put the app token in values.generated.yaml under secrets.backend.VAULT_TOKEN, then:

helm upgrade aurora-oss ./deploy/helm/aurora \
--namespace aurora-oss --reset-values \
-f deploy/helm/aurora/values.generated.yaml

Tip

With manual unsealing, you'll need to run vault operator unseal <UNSEAL_KEY> after every Vault pod restart.

Step 5: Verify

# All pods should be Running
kubectl get pods -n aurora-oss

# Check ingress
kubectl get ingress -n aurora-oss

# Test the API
curl http://api.aurora-oss.<IP>.nip.io/health/

Open the frontend URL in your browser. The first user to register becomes the admin.

Ingress Controller

Aurora's Helm chart is controller-agnostic — it uses the standard Kubernetes ingressClassName field. Set ingress.className in your values to match your controller.

Any ingress controller that supports the Kubernetes Ingress API will work. Common options:

Controller      className   Install
NGINX Ingress   nginx       helm install ingress-nginx ingress-nginx/ingress-nginx -n ingress-nginx --create-namespace
Traefik         traefik     Often bundled with k3s; or helm install traefik traefik/traefik -n traefik --create-namespace
AWS ALB         alb         Install AWS Load Balancer Controller
HAProxy         haproxy     helm install haproxy haproxytech/kubernetes-ingress -n haproxy --create-namespace

Required controller settings

Regardless of which controller you use, ensure these are configured:

Setting                Value   Why
Request/read timeout   3600s   RCA analysis can run 30+ minutes
HTTP version           1.1     Required for WebSocket upgrade
Max body/upload size   50m     File uploads

When className is nginx, these are auto-applied as annotations by the Helm chart. For other controllers, configure equivalent settings via ingress.annotations or your controller's configuration.
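
For example, a hedged values overlay for the HAProxy controller (the annotation key follows haproxytech's haproxy.org/timeout-server convention; verify exact keys and value formats against your controller's docs):

ingress:
  className: "haproxy"  # match the name from: kubectl get ingressclass
  annotations:
    haproxy.org/timeout-server: "3600s"  # controller-side equivalent of the request timeout above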

MCP Ingress

The MCP server (aurora-mcp) runs on port 8811 inside the cluster. By default, it is reachable through the API ingress host at /mcp (e.g., https://api.<domain>/mcp). For local-only access, developers can use kubectl port-forward:

kubectl port-forward svc/aurora-oss-mcp 8811:8811 -n aurora-oss
# MCP endpoint: http://localhost:8811/mcp

To expose MCP on a dedicated hostname, set ingress.hosts.mcp:

ingress:
  hosts:
    mcp: "mcp.aurora-oss.<IP>.nip.io"  # or mcp.yourdomain.com

Warning

Enabling a dedicated MCP ingress exposes full platform access to any client with a valid token. The default /mcp path on the API host is already accessible. Always pair with an auth proxy (e.g. oauth2-proxy sidecar or nginx auth_request). See the MCP security guide.

EKS: Fixing nip.io URLs

AWS load balancers return a hostname instead of an IP. The deploy script resolves this automatically, but if your URLs aren't working, see the EKS setup guide or resolve manually:

# Resolve the ELB hostname to an IP (adjust service name/namespace for your controller)
ELB_HOST=$(kubectl get svc -n ingress-nginx ingress-nginx-controller \
-o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
INGRESS_IP=$(dig +short "$ELB_HOST" | head -1)

# Update values
VALUES="deploy/helm/aurora/values.generated.yaml"
yq -i ".ingress.hosts.frontend = \"aurora-oss.${INGRESS_IP}.nip.io\"" "$VALUES"
yq -i ".ingress.hosts.api = \"api.aurora-oss.${INGRESS_IP}.nip.io\"" "$VALUES"
yq -i ".ingress.hosts.ws = \"ws.aurora-oss.${INGRESS_IP}.nip.io\"" "$VALUES"
yq -i ".config.NEXT_PUBLIC_BACKEND_URL = \"http://api.aurora-oss.${INGRESS_IP}.nip.io\"" "$VALUES"
yq -i ".config.NEXT_PUBLIC_WEBSOCKET_URL = \"ws://ws.aurora-oss.${INGRESS_IP}.nip.io\"" "$VALUES"
yq -i ".config.FRONTEND_URL = \"http://aurora-oss.${INGRESS_IP}.nip.io\"" "$VALUES"

# Redeploy
helm upgrade aurora-oss ./deploy/helm/aurora \
--namespace aurora-oss --reset-values -f "$VALUES"

Warning

ELB IPs can change. For production, set up real DNS CNAME records pointing to the ELB hostname instead of nip.io.

DNS & TLS

DNS Configuration

For testing, nip.io works without any DNS setup. For production, create DNS records:

aurora.yourdomain.com CNAME <ingress-hostname-or-IP>
api.aurora.yourdomain.com CNAME <ingress-hostname-or-IP>
ws.aurora.yourdomain.com CNAME <ingress-hostname-or-IP>
mcp.aurora.yourdomain.com CNAME <ingress-hostname-or-IP> # optional, see MCP Ingress above

TLS with cert-manager (Let's Encrypt)

# Install cert-manager
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.3/cert-manager.yaml
kubectl wait --for=condition=ready pod -l app.kubernetes.io/instance=cert-manager -n cert-manager --timeout=120s

# Create issuer
cat <<EOF | kubectl apply -f -
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@yourdomain.com
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
    - http01:
        ingress:
          ingressClassName: nginx  # Change to match your ingress controller
EOF

Then in values.generated.yaml:

ingress:
  tls:
    enabled: true
  certManager:
    enabled: true
    issuer: "letsencrypt-prod"
    email: "admin@yourdomain.com"

Manual TLS certificate

kubectl create secret tls aurora-tls \
--cert=fullchain.crt --key=privkey.key -n aurora-oss

Then in values.generated.yaml:

ingress:
  tls:
    enabled: true
    secretName: "aurora-tls"

Private / VPN Deployment

For deployments where Aurora should only be reachable over a VPN or within your VPC — not exposed to the public internet. This makes the ingress load balancer internal (no public IP); cluster nodes may still have outbound internet access.

# With prebuilt images (nodes must have outbound internet to pull from GHCR)
./deploy/k8s-deploy.sh --private --skip-build

# With custom-built images (for air-gapped clusters or private registries)
./deploy/k8s-deploy.sh --private

The script prompts for a private hostname (e.g. aurora.internal), provisions an internal load balancer, and configures the frontend with your hostname.

Air-gapped clusters

--private only controls the load balancer type — it does not mean the cluster is air-gapped. If your nodes cannot reach the internet, you must build and push images to a private registry accessible from your cluster. Use --private without --skip-build and provide your private registry when prompted.

DNS: Your hostname must resolve on the VPN. Options: split-horizon DNS, Tailscale MagicDNS, or /etc/hosts entries.
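
For a quick client-side test, an /etc/hosts entry works (the IP is a placeholder for your internal load balancer; the api./ws. prefixes mirror the ingress host layout used earlier):

10.0.12.34 aurora.internal api.aurora.internal ws.aurora.internal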

Internal LB annotations (applied automatically by the script):

Cloud   Annotation
GKE     cloud.google.com/load-balancer-type: "Internal"
EKS     service.beta.kubernetes.io/aws-load-balancer-internal: "true"
AKS     service.beta.kubernetes.io/azure-load-balancer-internal: "true"
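
As a sketch, the EKS annotation can also be applied by hand to the controller's Service, though some clouds only honor it at Service creation time, so the Service may need to be recreated:

# Adjust Service name/namespace for your controller
kubectl annotate svc ingress-nginx-controller -n ingress-nginx \
service.beta.kubernetes.io/aws-load-balancer-internal="true" --overwrite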

Local Kubernetes

For local dev on OrbStack, Docker Desktop, or Rancher Desktop:

./deploy/k8s-deploy.sh --local

This skips registry push, enables built-in MinIO for S3 storage, and builds images locally.

Upgrading

# Config-only change (published chart) — pin --version to avoid unintended upgrades
helm upgrade aurora-oss aurora/aurora-oss --version <current-version> \
--reset-values -f values.generated.yaml -n aurora-oss

# Config-only change (local chart)
helm upgrade aurora-oss ./deploy/helm/aurora \
--reset-values -f deploy/helm/aurora/values.generated.yaml -n aurora-oss

# Upgrade to a newer chart version (OCI) — replace <version> with the target release
helm upgrade aurora-oss oci://ghcr.io/arvo-ai/charts/aurora-oss --version <version> \
--reset-values -f values.generated.yaml -n aurora-oss

# Rollback
helm rollback aurora-oss -n aurora-oss
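
helm rollback with no revision returns to the previous release. To target a specific revision, inspect the history first:

helm history aurora-oss -n aurora-oss
helm rollback aurora-oss <REVISION> -n aurora-oss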

Uninstalling

helm uninstall aurora-oss -n aurora-oss
kubectl delete namespace aurora-oss

Troubleshooting

StatefulSets stuck in Pending

Pods with no events usually mean PVCs can't bind.

kubectl get pvc -n aurora-oss # check PVC status
kubectl get storageclass # is there a default?

EKS: This is almost always a missing storage driver. See the EKS setup guide — specifically the EBS CSI Driver section and its troubleshooting.

After fixing storage, delete stuck PVCs and pods to force recreation:

kubectl delete pvc --all -n aurora-oss
kubectl delete pods --all -n aurora-oss

Frontend loads but API returns 403

  1. Check backend is reachable: curl http://api.aurora-oss.<IP>.nip.io/health/
  2. Check URLs in frontend config: curl http://aurora-oss.<IP>.nip.io/env-config.js
  3. Check backend logs: kubectl logs -n aurora-oss deploy/aurora-oss-server --tail=20

If you see RBAC denied, restart services:

kubectl rollout restart deployment/aurora-oss-server \
deployment/aurora-oss-celery-worker \
deployment/aurora-oss-chatbot -n aurora-oss

Vault sealed after restart

Without KMS auto-unseal, Vault seals on every pod restart:

kubectl -n aurora-oss exec -it statefulset/aurora-oss-vault -- \
vault operator unseal <UNSEAL_KEY>

For production, set up KMS auto-unseal.

AWS quota limits

See EKS Setup — Troubleshooting.

Image pull errors

kubectl get events -n aurora-oss --sort-by='.lastTimestamp'

If using GHCR prebuilt images, no registry auth is needed (images are public).

Frontend CrashLoopBackOff

Check the frontend logs:

kubectl logs -n aurora-oss deploy/aurora-oss-frontend --tail=10

If they show Permission denied on env-config.js, the runtime user cannot write the injected config; ensure the Dockerfile sets ownership of that file.

Configuration Reference

Object Storage

Aurora requires S3-compatible storage. Options:

Provider           STORAGE_ENDPOINT_URL                            Notes
AWS S3             https://s3.amazonaws.com                        Create bucket + IAM user
MinIO              http://minio:9000                               Self-hosted, set STORAGE_USE_SSL: "false"
Cloudflare R2      https://<ACCOUNT_ID>.r2.cloudflarestorage.com   Set region to auto
GCS (S3 interop)   https://storage.googleapis.com                  Create HMAC keys

Internal Service Discovery

These are auto-generated by Helm if left empty:

  • POSTGRES_HOST: <release>-postgres
  • REDIS_URL: redis://<release>-redis:6379/0
  • WEAVIATE_HOST: <release>-weaviate
  • VAULT_ADDR: http://<release>-vault:8200

Leave them empty unless using external managed services.
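
If you do point Aurora at external services, a hedged overlay looks like this (placement under config follows the pattern shown earlier; hostnames are placeholders):

config:
  POSTGRES_HOST: "db.internal.example.com"
  REDIS_URL: "redis://redis.internal.example.com:6379/0"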

All Values

See deploy/helm/aurora/values.yaml for the complete list of configuration options.

Using Pre-Existing Kubernetes Secrets

By default, the chart creates Kubernetes Secrets from values in values.yaml. For production deployments where secrets are managed externally (via Terraform, External Secrets Operator, Sealed Secrets, or manual kubectl create secret), you can point each secret group to a pre-existing Kubernetes Secret instead.

Set existingSecret on any of the four secret groups to skip chart-managed secret creation for that group:

secrets:
  db:
    existingSecret: "my-db-secret"
  backend:
    existingSecret: "my-backend-secret"
  app:
    existingSecret: "my-app-secret"
  llm:
    existingSecret: "my-llm-secret"

You can mix and match: use existingSecret for some groups and inline values for others (see the sketch below).
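
For example, a mixed configuration (secret names are placeholders):

secrets:
  db:
    existingSecret: "my-db-secret"       # managed externally
  llm:
    OPENROUTER_API_KEY: "sk-or-v1-..."   # inline, chart-managed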

Required keys per group:

Group     Required Keys
db        POSTGRES_USER, POSTGRES_PASSWORD
backend   VAULT_TOKEN (if using Vault); STORAGE_ACCESS_KEY and STORAGE_SECRET_KEY are optional when using IRSA/pod identity
app       FLASK_SECRET_KEY, AUTH_SECRET, SEARXNG_SECRET
llm       At least one of: OPENROUTER_API_KEY, OPENAI_API_KEY, ANTHROPIC_API_KEY, GOOGLE_AI_API_KEY

Example: Creating the secrets before installing the chart:

kubectl create secret generic my-db-secret -n aurora-oss \
--from-literal=POSTGRES_USER=aurora \
--from-literal=POSTGRES_PASSWORD="$(openssl rand -base64 32)"

kubectl create secret generic my-backend-secret -n aurora-oss \
--from-literal=VAULT_TOKEN="your-vault-token" \
--from-literal=STORAGE_ACCESS_KEY="your-access-key" \
--from-literal=STORAGE_SECRET_KEY="your-secret-key"

The external secrets must exist in the target namespace before running helm install.