LLM Providers

Aurora requires an LLM provider for its AI-powered investigation and Root Cause Analysis (RCA) capabilities. You can use a single gateway like OpenRouter, connect directly to individual providers, or run models locally with Ollama.

Supported Providers

| Provider | Mode | Environment Variable | Get API Key |
|---|---|---|---|
| OpenRouter | Gateway | OPENROUTER_API_KEY | openrouter.ai/keys |
| OpenAI | Direct | OPENAI_API_KEY | platform.openai.com |
| Anthropic | Direct | ANTHROPIC_API_KEY | console.anthropic.com |
| Google AI | Direct | GOOGLE_AI_API_KEY | ai.google.dev |
| Vertex AI | Direct | VERTEX_AI_PROJECT + credentials | console.cloud.google.com |
| Ollama | Direct | OLLAMA_BASE_URL | ollama.com (free, local) |
| AWS Bedrock | Direct | BEDROCK_BASE_URL (gateway) or BEDROCK_REGION (native) | aws.amazon.com/bedrock |

Only one provider is required.

Provider Modes

Aurora supports three routing modes, controlled by LLM_PROVIDER_MODE:

OpenRouter Mode (default)

Routes all LLM requests through OpenRouter, giving you access to multiple model providers with a single API key.

LLM_PROVIDER_MODE=openrouter
OPENROUTER_API_KEY=sk-or-v1-...

Direct Mode

Connects directly to each provider's native API. Use this when running models locally with Ollama or when you prefer direct API access.

LLM_PROVIDER_MODE=direct

In direct mode, Aurora auto-detects the provider from the model name prefix (e.g., anthropic/claude-3-haiku routes to Anthropic, google/gemini-2.5-flash routes to Google AI).
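
The prefix-based routing described above can be sketched as follows. This is an illustration only, not Aurora's actual code: the function name `detect_provider` and the lookup table are hypothetical, and the prefixes are simply the ones used in model names on this page.

```python
# Hypothetical sketch of prefix-based provider detection in direct mode.
# The prefixes mirror the provider/model names used in this document.
KNOWN_PREFIXES = {
    "openai": "OpenAI",
    "anthropic": "Anthropic",
    "google": "Google AI",
    "vertex": "Vertex AI",
    "ollama": "Ollama",
    "bedrock": "AWS Bedrock",
}


def detect_provider(model_name: str) -> str:
    """Return the provider for a provider/model name such as 'anthropic/claude-3-haiku'."""
    # Split on the first "/" only, so Bedrock ids containing dots and colons stay intact.
    prefix, sep, model = model_name.partition("/")
    if not sep or not model:
        raise ValueError(f"expected provider/model, got {model_name!r}")
    if prefix not in KNOWN_PREFIXES:
        raise ValueError(f"unknown provider prefix {prefix!r}")
    return KNOWN_PREFIXES[prefix]
```

In this sketch a name without a recognised prefix is rejected outright; how Aurora itself treats such names is not covered here.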

Provider Mode (route everything through one provider)

Set LLM_PROVIDER_MODE to a provider name to send every model pick through that one backend — useful when a deployment standardizes on a single provider (e.g. a customer running entirely on AWS Bedrock):

LLM_PROVIDER_MODE=bedrock # also accepts: vertex, anthropic, openai, google, ollama

A clean pick like Claude Opus 4.7 is then translated to that provider's native id automatically (us.anthropic.claude-opus-4-7 on Bedrock) — no bedrock/ prefix or per-model setup. Models the provider can't serve (e.g. Gemini under bedrock) fall back to their own provider. It's the same idea as openrouter mode, pointed at a single direct provider.
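
A rough sketch of the translate-or-fall-back behaviour described above, assuming a lookup table from clean model names to native ids. The names `BEDROCK_NATIVE_IDS` and `resolve_pick` are hypothetical, and the table holds only the one example given on this page; Aurora's real catalog and resolution logic may differ.

```python
# Hypothetical sketch: resolve a clean model pick under LLM_PROVIDER_MODE=bedrock.
# The single catalog entry is the example from this document; a real catalog
# would cover every model the provider can serve.
BEDROCK_NATIVE_IDS = {
    "anthropic/claude-opus-4.7": "us.anthropic.claude-opus-4-7",
}


def resolve_pick(model_name: str, native_ids: dict[str, str]) -> tuple[str, str]:
    """Return (provider, model_id) for a pick in single-provider mode."""
    if model_name in native_ids:
        # The forced provider can serve this model: use its native id.
        return "bedrock", native_ids[model_name]
    # Otherwise fall back to the model's own provider (the name prefix).
    own_provider, _, model = model_name.partition("/")
    return own_provider, model
```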

Supported Models

| Provider | Model | Notes |
|---|---|---|
| OpenAI | openai/gpt-5.4 | Latest flagship, 1M context |
| | openai/gpt-5.2 | Previous flagship |
| | openai/o3 | Strong reasoning model |
| | openai/o4-mini | Fast reasoning, lower cost |
| | openai/o3-mini | Compact reasoning model |
| | openai/gpt-4.1 | Reliable all-rounder |
| | openai/gpt-4.1-mini | Fast and affordable |
| | openai/gpt-4o | Multimodal (text + vision) |
| | openai/gpt-4o-mini | Cheapest OpenAI option |
| Anthropic | anthropic/claude-opus-4.6 | Most capable, 1M context |
| | anthropic/claude-sonnet-4.6 | Near Opus quality at lower cost |
| | anthropic/claude-opus-4.5 | Previous generation flagship |
| | anthropic/claude-sonnet-4.5 | Balanced quality and speed |
| | anthropic/claude-haiku-4.5 | Fast, affordable (default RCA model) |
| | anthropic/claude-3.5-sonnet | Widely used, reliable |
| | anthropic/claude-3-haiku | Cheapest |
| Google Gemini | google/gemini-3.1-pro-preview | Latest flagship with thinking |
| | google/gemini-3-flash-preview | Fast, outperforms 2.5 Pro |
| | google/gemini-2.5-pro | Strong for complex tasks |
| | google/gemini-2.5-flash | Cost-effective |
| | google/gemini-2.5-flash-lite | Cheapest Gemini option |
| Vertex AI | vertex/gemini-3.1-pro-preview | Latest flagship with thinking |
| | vertex/gemini-3-flash-preview | Fast, enterprise-grade |
| | vertex/gemini-2.5-pro | Strong for complex tasks |
| | vertex/gemini-2.5-flash | Cost-effective with IAM auth |
| | vertex/gemini-2.5-flash-lite | Cheapest Vertex option |
| Ollama | ollama/llama3.1 | Meta's Llama 3.1 (8B/70B) |
| | ollama/qwen2.5 | Alibaba's Qwen 2.5 (various sizes) |
| | Any model via ollama pull | |
| AWS Bedrock | bedrock/us.anthropic.claude-sonnet-4-5-v1:0 | Native mode: a Bedrock inference-profile id (region-prefixed us./eu./apac.) |
| | bedrock/us.anthropic.claude-haiku-4-5-v1:0 | Faster, cheaper Claude on Bedrock |
| | Gateway: the model name your gateway expects | Gateway mode passes the suffix through to your OpenAI-compatible endpoint |

Model names use the provider/model format. New models from each provider are generally supported automatically: set the relevant env var (MAIN_MODEL, RCA_MODEL) to the new model name, or select chat models in the UI.

Provider Setup

OpenRouter

The easiest way to get started. One API key gives you access to models from OpenAI, Anthropic, Google, Meta, and more.

OPENROUTER_API_KEY=sk-or-v1-...
LLM_PROVIDER_MODE=openrouter

OpenAI

OPENAI_API_KEY=sk-...
LLM_PROVIDER_MODE=direct

Anthropic

ANTHROPIC_API_KEY=sk-ant-...
LLM_PROVIDER_MODE=direct

Google AI (Gemini via API Key)

For using Gemini models with a Google AI Studio API key.

GOOGLE_AI_API_KEY=AIza...
LLM_PROVIDER_MODE=direct

Vertex AI (Gemini via Google Cloud)

For organizations using Google Cloud. Vertex AI provides enterprise-grade access to Gemini models with IAM-based authentication.

Requirements:

  • A Google Cloud project with Vertex AI API enabled
  • A service account with the Vertex AI User role

Setup:

# Google Cloud project ID (read from the service account JSON when unset)
VERTEX_AI_PROJECT=my-gcp-project

# Service account credentials as a JSON string (omit when using ADC or GOOGLE_APPLICATION_CREDENTIALS)
VERTEX_AI_SERVICE_ACCOUNT_JSON={"type":"service_account","project_id":"...","private_key":"..."}

# Optional: Location (default: global)
VERTEX_AI_LOCATION=global

LLM_PROVIDER_MODE=direct

Authentication options:

  1. Service account JSON (recommended): Set VERTEX_AI_SERVICE_ACCOUNT_JSON to the full JSON contents of your service account key file.
  2. Application Default Credentials (ADC): If running on GCP (Cloud Run, GKE), ADC is automatic — just set VERTEX_AI_PROJECT.
  3. Credentials file path: Set GOOGLE_APPLICATION_CREDENTIALS to the path of your service account key file.

Tip: The project ID is automatically extracted from the service account JSON if VERTEX_AI_PROJECT is not set.
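
As a sketch of how the project ID could be resolved from these settings: the helper name `resolve_vertex_project` is hypothetical, and passing the environment in as a mapping is an assumption made to keep the example testable. The `project_id` field is the standard key in a Google service account key file.

```python
import json
from typing import Mapping, Optional


def resolve_vertex_project(env: Mapping[str, str]) -> Optional[str]:
    """Prefer VERTEX_AI_PROJECT; otherwise read project_id from the service account JSON."""
    project = env.get("VERTEX_AI_PROJECT")
    if project:
        return project
    raw = env.get("VERTEX_AI_SERVICE_ACCOUNT_JSON")
    if not raw:
        # No explicit project and no inline JSON to read one from.
        return None
    return json.loads(raw).get("project_id")
```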

Ollama (Local Models)

Run models locally on your own hardware with Ollama. No API key needed.

Setup:

  1. Install Ollama on your host machine
  2. Pull the models you want to use:
    ollama pull llama3.1
    ollama pull qwen2.5:32b
  3. Configure Aurora:
    OLLAMA_BASE_URL=http://host.docker.internal:11434
    LLM_PROVIDER_MODE=direct

Docker networking: host.docker.internal allows Docker containers to reach services running on the host machine. This works out of the box on macOS and Windows. On Linux, Aurora's Docker Compose files include the extra_hosts configuration needed for this to work.

Recommended models for RCA:

| Model | Size | Notes |
|---|---|---|
| llama3.1:70b | 70B | Best quality for complex RCA |
| qwen2.5:32b | 32B | Good balance of quality and speed |
| llama3.2 | 3B | Fast, but limited tool calling |

AWS Bedrock

Use AWS Bedrock for Claude (and other Bedrock models) either through an OpenAI-compatible gateway or directly via the AWS SDK. Aurora picks the mode automatically: if BEDROCK_BASE_URL is set it uses gateway mode, otherwise it uses native mode.
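
The automatic mode selection amounts to a single check. A minimal sketch, assuming an empty BEDROCK_BASE_URL counts as unset (the function name `bedrock_mode` is hypothetical):

```python
from typing import Mapping


def bedrock_mode(env: Mapping[str, str]) -> str:
    """Return 'gateway' when BEDROCK_BASE_URL is set, otherwise 'native'."""
    # Assumption: an empty string is treated the same as an unset variable.
    return "gateway" if env.get("BEDROCK_BASE_URL") else "native"
```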

Bedrock is configured by an admin via environment variables. There are two ways to route models to it:

  • LLM_PROVIDER_MODE=bedrock (recommended for native mode) — every model picked in the app (e.g. "Claude Opus 4.7") routes through Bedrock automatically, translated to the matching inference-profile id (region-aware). No bedrock/ prefix and no per-model configuration needed; the picker shows clean model names.
  • LLM_PROVIDER_MODE=direct + explicit bedrock/<id> model ids (e.g. MAIN_MODEL=bedrock/us.anthropic.claude-sonnet-4-5-v1:0) — pin specific Bedrock ids per model. Use this for gateway mode, where the model name is whatever your gateway expects.

Gateway mode (OpenAI-compatible endpoint)

For an endpoint that exposes an OpenAI-compatible API (POST .../v1/chat/completions) in front of Bedrock — for example an AWS Bedrock Access Gateway running inside your VPC. No AWS credentials are needed in Aurora; the gateway (and your network boundary) handles auth.

# OpenAI-compatible base URL (Aurora appends /chat/completions)
BEDROCK_BASE_URL=https://bedrock-gateway.internal.example.com/v1

# Optional — only if your gateway requires a key. Often unset (the VPC boundary handles auth).
BEDROCK_API_KEY=

LLM_PROVIDER_MODE=direct
MAIN_MODEL=bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0 # the model name your gateway expects
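
To make the URL handling concrete, here is a small sketch of how the request URL follows from BEDROCK_BASE_URL. The helper name `chat_completions_url` is hypothetical, and stripping a trailing slash is an assumption rather than documented Aurora behaviour.

```python
def chat_completions_url(base_url: str) -> str:
    """Append /chat/completions to an OpenAI-compatible base URL (ending in /v1)."""
    # Assumption: a trailing slash on the base URL is tolerated and removed.
    return base_url.rstrip("/") + "/chat/completions"
```

With the base URL above, this yields https://bedrock-gateway.internal.example.com/v1/chat/completions, which is why the base URL itself should stop at /v1.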

Native mode (AWS SDK)

For talking to AWS Bedrock directly. Requires a region plus AWS credentials (or an IAM role). On-demand Claude models on Bedrock require an inference-profile id (region-prefixed, e.g. us.anthropic.claude-sonnet-4-5-v1:0), not the bare model id.

# Required: AWS region (falls back to AWS_REGION / AWS_DEFAULT_REGION)
BEDROCK_REGION=us-east-1

# Credentials — omit these to use an IAM role or the default AWS credential chain.
# BEDROCK_* takes precedence over the standard AWS_* variables.
BEDROCK_ACCESS_KEY_ID=AKIA...
BEDROCK_SECRET_ACCESS_KEY=...
# Required only with temporary / STS credentials (e.g. an assumed role):
# BEDROCK_SESSION_TOKEN=...
# Or use a named profile instead of explicit keys:
# BEDROCK_PROFILE=my-bedrock-profile

# Recommended: route every model pick through Bedrock with clean model names.
LLM_PROVIDER_MODE=bedrock
MAIN_MODEL=anthropic/claude-sonnet-4.6 # auto-translated to us.anthropic.claude-sonnet-4-6
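
The fallback and precedence rules in the comments above can be sketched as a "first variable that is set wins" lookup. The helper names are hypothetical, and the order between AWS_REGION and AWS_DEFAULT_REGION is assumed from the order in which they are listed; AWS_ACCESS_KEY_ID is the standard AWS variable name.

```python
from typing import Mapping, Optional


def first_set(env: Mapping[str, str], *names: str) -> Optional[str]:
    """Return the value of the first variable in names that is set and non-empty."""
    for name in names:
        value = env.get(name)
        if value:
            return value
    return None


def bedrock_region(env: Mapping[str, str]) -> Optional[str]:
    # BEDROCK_REGION first, then the standard AWS variables (order assumed).
    return first_set(env, "BEDROCK_REGION", "AWS_REGION", "AWS_DEFAULT_REGION")


def bedrock_access_key_id(env: Mapping[str, str]) -> Optional[str]:
    # BEDROCK_* takes precedence over the standard AWS_* variable.
    return first_set(env, "BEDROCK_ACCESS_KEY_ID", "AWS_ACCESS_KEY_ID")
```

A result of None for the key means no explicit key was configured, in which case an IAM role or the default AWS credential chain would have to supply credentials.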

Requirements (native mode):

  • A Bedrock-enabled AWS account with access to the chosen model granted in the Bedrock console.
  • An identity (IAM user/role) with bedrock:InvokeModel / bedrock:InvokeModelWithResponseStream permissions.

Tip: Gateway and native are the same bedrock provider. Set BEDROCK_BASE_URL for gateway mode and leave it unset for native. Each mode's recommended LLM_PROVIDER_MODE and model-id style is shown above.

RCA Model Configuration

Background RCA uses the single-agent path by default (ORCHESTRATOR_ENABLED=false), configured via RCA_MODEL. An opt-in multi-agent orchestrator is also available — see Multi-agent orchestrator below.

Single-agent RCA (default)

By default, Aurora uses anthropic/claude-haiku-4.5 for background Root Cause Analysis. You can change this to any supported provider/model.

# Format: provider/model-name
RCA_MODEL=anthropic/claude-haiku-4.5

Examples:

# Anthropic (default)
RCA_MODEL=anthropic/claude-haiku-4.5

# OpenAI
RCA_MODEL=openai/gpt-4o

# Google AI
RCA_MODEL=google/gemini-2.5-flash

# Vertex AI
RCA_MODEL=vertex/gemini-2.5-flash

# Ollama (local)
RCA_MODEL=ollama/llama3.1

# AWS Bedrock (native — inference-profile id)
RCA_MODEL=bedrock/us.anthropic.claude-haiku-4-5-v1:0

When RCA_MODEL is not set, the default depends on RCA_OPTIMIZE_COSTS:

  • RCA_OPTIMIZE_COSTS=true (default): Uses anthropic/claude-haiku-4.5
  • RCA_OPTIMIZE_COSTS=false: Uses anthropic/claude-opus-4.6
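
The selection rule above can be written out as a short sketch. The function name `resolve_rca_model` is hypothetical, and treating only the literal string "true" (case-insensitive) as enabled is an assumption about how the flag is parsed.

```python
from typing import Mapping


def resolve_rca_model(env: Mapping[str, str]) -> str:
    """Pick the single-agent RCA model: explicit RCA_MODEL, else a cost-based default."""
    explicit = env.get("RCA_MODEL")
    if explicit:
        return explicit
    # RCA_OPTIMIZE_COSTS defaults to true; the parsing here is an assumption.
    optimize = env.get("RCA_OPTIMIZE_COSTS", "true").strip().lower() == "true"
    return "anthropic/claude-haiku-4.5" if optimize else "anthropic/claude-opus-4.6"
```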

Multi-agent orchestrator

Opt-in via ORCHESTRATOR_ENABLED=true. A lead orchestrator triages each incident and may fan out parallel read-only sub-agents. When enabled, RCA_MODEL is bypassed and two additional models are required:

ORCHESTRATOR_ENABLED=true
RCA_ORCHESTRATOR_MODEL=anthropic/claude-opus-4.7 # required: triage + synthesis
RCA_SUBAGENT_MODEL=anthropic/claude-sonnet-4.6 # required: sub-agent investigators

The split exists because triage/synthesis needs reliable structured-output JSON while sub-agents need reliable tool-calling. Per-role overrides are supported — set model: in the frontmatter of server/chat/backend/agent/orchestrator/roles/*.md.
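
A startup check for the orchestrator settings might look like the sketch below. Both function names are hypothetical, the flag parsing is an assumption, and the single-agent branch is simplified to the default that applies when RCA_OPTIMIZE_COSTS is left at true.

```python
from typing import Mapping

REQUIRED_ORCHESTRATOR_VARS = ("RCA_ORCHESTRATOR_MODEL", "RCA_SUBAGENT_MODEL")


def missing_orchestrator_vars(env: Mapping[str, str]) -> list[str]:
    """List the required orchestrator variables that are unset or empty."""
    return [name for name in REQUIRED_ORCHESTRATOR_VARS if not env.get(name)]


def rca_models(env: Mapping[str, str]) -> dict[str, str]:
    """Return the models background RCA would use for the given settings."""
    enabled = env.get("ORCHESTRATOR_ENABLED", "false").strip().lower() == "true"
    if not enabled:
        # Single-agent path: RCA_MODEL, or the default with RCA_OPTIMIZE_COSTS=true.
        return {"rca": env.get("RCA_MODEL") or "anthropic/claude-haiku-4.5"}
    missing = missing_orchestrator_vars(env)
    if missing:
        raise ValueError("ORCHESTRATOR_ENABLED=true requires: " + ", ".join(missing))
    # RCA_MODEL is bypassed once the orchestrator is enabled.
    return {
        "orchestrator": env["RCA_ORCHESTRATOR_MODEL"],
        "subagent": env["RCA_SUBAGENT_MODEL"],
    }
```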

Cost Considerations

LLM costs depend on:

  • Tokens processed: Longer investigations use more tokens
  • Model choice: Larger models cost more per token
  • Frequency: More investigations = higher costs
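
The three factors combine multiplicatively, which a back-of-the-envelope estimate makes visible. The function below is an illustrative sketch; prices are expressed per million tokens, and any figures you plug in are your own assumptions, not actual provider pricing.

```python
def monthly_cost_usd(
    investigations: int,
    input_tokens: int,
    output_tokens: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
) -> float:
    """Estimate monthly spend from investigation count, tokens per run, and per-token price."""
    token_cost = input_tokens * input_price_per_mtok + output_tokens * output_price_per_mtok
    return investigations * token_cost / 1_000_000
```

For example, 200 investigations a month at 50,000 input and 5,000 output tokens each, with hypothetical prices of $1 and $5 per million tokens, comes to about $15; doubling either the investigation count or the per-token price doubles the total.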

Cost Optimization

  1. Set RCA_OPTIMIZE_COSTS=true to use cheaper models for background RCA
  2. Use OpenRouter for flexible, pay-per-token pricing
  3. Use Ollama for zero API costs (requires local GPU)

Safety Guardrail Model

Aurora can run an LLM-based command safety judge before executing commands, catching novel dangerous behavior that deterministic rules cannot anticipate.

# Format: provider/model-name (same as MAIN_MODEL / RCA_MODEL)
GUARDRAILS_LLM_MODEL=openai/gpt-4o-mini

The safety judge model can be any provider supported above. A fast, cheap, non-reasoning model is recommended since this runs on every command execution — reasoning models waste tokens on chain-of-thought for a simple Yes/No classification. See Command Safety Configuration for full setup details.

Troubleshooting

"Invalid API key"

  • Check that the key was copied correctly (no extra spaces)
  • Verify that the key is active in the provider dashboard
  • Ensure the environment variable name is correct

"Rate limit exceeded"

  • Wait and retry
  • Consider upgrading your API tier
  • Reduce concurrent investigations

"Model not available"

  • Check the provider's status page
  • Try a different model
  • Ensure your API key has access to the model

Vertex AI: "DefaultCredentialsError"

  • Verify VERTEX_AI_PROJECT is set
  • Check that VERTEX_AI_SERVICE_ACCOUNT_JSON contains valid JSON
  • Ensure the service account has the Vertex AI User role

Ollama: "Provider not available"

  • Verify Ollama is running: curl http://localhost:11434/api/tags
  • Check that the model is pulled: ollama list
  • Ensure OLLAMA_BASE_URL is correct (use host.docker.internal in Docker)
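
The first two checks can also be scripted. The sketch below queries Ollama's /api/tags endpoint and looks for a model in the response; the helper names are hypothetical, and matching a bare name such as llama3.1 against llama3.1:latest relies on Ollama's default tag.

```python
import json
import urllib.request


def pulled_models(tags_payload: dict) -> set[str]:
    """Extract model names from an Ollama /api/tags response body."""
    return {m["name"] for m in tags_payload.get("models", [])}


def is_pulled(model: str, names: set[str]) -> bool:
    """True if model is present; a bare name like 'llama3.1' matches 'llama3.1:latest'."""
    wanted = model if ":" in model else f"{model}:latest"
    return wanted in names


def fetch_tags(base_url: str = "http://localhost:11434") -> dict:
    """GET <base_url>/api/tags; raises URLError if Ollama is not reachable."""
    url = base_url.rstrip("/") + "/api/tags"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)
```

Calling `is_pulled("llama3.1", pulled_models(fetch_tags()))` from the host covers both checks at once; from inside a container, pass the OLLAMA_BASE_URL value as `base_url` instead.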

Bedrock: "on-demand throughput isn't supported"

  • Native mode requires an inference-profile id, not the bare model id. Use a region-prefixed id such as bedrock/us.anthropic.claude-sonnet-4-5-v1:0 (us./eu./apac.).
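
A quick pre-flight check for this mistake is sketched below. The function name is hypothetical, and it recognises only the three region prefixes named on this page, so treat a False result as a hint rather than proof.

```python
# Region prefixes mentioned in this document; other prefixes may exist.
REGION_PREFIXES = ("us.", "eu.", "apac.")


def is_inference_profile_id(model_name: str) -> bool:
    """True if the Bedrock model id (with or without 'bedrock/') is region-prefixed."""
    model_id = model_name.removeprefix("bedrock/")
    return model_id.startswith(REGION_PREFIXES)
```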

Bedrock: "Unable to locate credentials" / "NoRegionError"

  • Native mode needs a region — set BEDROCK_REGION (or AWS_REGION / AWS_DEFAULT_REGION).
  • Provide credentials via BEDROCK_ACCESS_KEY_ID + BEDROCK_SECRET_ACCESS_KEY, a BEDROCK_PROFILE, or an IAM role / the default AWS credential chain.
  • Confirm the identity has bedrock:InvokeModel permissions and that model access is granted in the Bedrock console.

Bedrock: "AccessDeniedException" calling the gateway

  • In gateway mode, set BEDROCK_BASE_URL to the OpenAI base path (ending in /v1); Aurora appends /chat/completions.
  • If your gateway requires a key, set BEDROCK_API_KEY. If it doesn't, leave it unset.