LLM Providers

Aurora requires an LLM provider for its AI-powered investigation and Root Cause Analysis (RCA) capabilities. You can use a single gateway like OpenRouter, connect directly to individual providers, or run models locally with Ollama.

Supported Providers

| Provider   | Mode    | Environment Variable            | Get API Key              |
| ---------- | ------- | ------------------------------- | ------------------------ |
| OpenRouter | Gateway | OPENROUTER_API_KEY              | openrouter.ai/keys       |
| OpenAI     | Direct  | OPENAI_API_KEY                  | platform.openai.com      |
| Anthropic  | Direct  | ANTHROPIC_API_KEY               | console.anthropic.com    |
| Google AI  | Direct  | GOOGLE_AI_API_KEY               | ai.google.dev            |
| Vertex AI  | Direct  | VERTEX_AI_PROJECT + credentials | console.cloud.google.com |
| Ollama     | Direct  | OLLAMA_BASE_URL                 | ollama.com (free, local) |

Only one provider is required.

Provider Modes

Aurora supports two routing modes, controlled by LLM_PROVIDER_MODE:

OpenRouter Mode (default)

Routes all LLM requests through OpenRouter, giving you access to multiple model providers with a single API key.

LLM_PROVIDER_MODE=openrouter
OPENROUTER_API_KEY=sk-or-v1-...

Direct Mode

Connects directly to each provider's native API. Use this when running models locally with Ollama or when you prefer direct API access.

LLM_PROVIDER_MODE=direct

In direct mode, Aurora auto-detects the provider from the model name prefix (e.g., anthropic/claude-3-haiku routes to Anthropic, google/gemini-2.5-flash routes to Google AI).
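The prefix detection described above amounts to splitting the model name on its first slash. A minimal shell sketch of the idea (not Aurora's actual implementation):

```shell
# Hypothetical sketch: derive the provider from the model-name prefix
model="anthropic/claude-3-haiku"
provider="${model%%/*}"   # strip everything from the first slash onward
echo "$provider"          # prints: anthropic
```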

note

Vertex AI and Ollama always use their native SDKs regardless of LLM_PROVIDER_MODE. OpenAI, Anthropic, and Google AI models can all be routed through OpenRouter.

Supported Models

| Provider      | Model                          | Notes                              |
| ------------- | ------------------------------ | ---------------------------------- |
| OpenAI        | openai/gpt-5.4                 | Latest flagship, 1M context        |
|               | openai/gpt-5.2                 | Previous flagship                  |
|               | openai/o3                      | Strong reasoning model             |
|               | openai/o4-mini                 | Fast reasoning, lower cost         |
|               | openai/o3-mini                 | Compact reasoning model            |
|               | openai/gpt-4.1                 | Reliable all-rounder               |
|               | openai/gpt-4.1-mini            | Fast and affordable                |
|               | openai/gpt-4o                  | Multimodal (text + vision)         |
|               | openai/gpt-4o-mini             | Cheapest OpenAI option             |
| Anthropic     | anthropic/claude-opus-4.6      | Most capable, 1M context           |
|               | anthropic/claude-sonnet-4.6    | Near Opus quality at lower cost    |
|               | anthropic/claude-opus-4.5      | Previous generation flagship       |
|               | anthropic/claude-sonnet-4.5    | Balanced quality and speed         |
|               | anthropic/claude-haiku-4.5     | Fast, affordable                   |
|               | anthropic/claude-3.5-sonnet    | Widely used, reliable              |
|               | anthropic/claude-3-haiku       | Cheapest (default RCA model)       |
| Google Gemini | google/gemini-3.1-pro-preview  | Latest flagship with thinking      |
|               | google/gemini-3-flash-preview  | Fast, outperforms 2.5 Pro          |
|               | google/gemini-2.5-pro          | Strong for complex tasks           |
|               | google/gemini-2.5-flash        | Cost-effective                     |
|               | google/gemini-2.5-flash-lite   | Cheapest Gemini option             |
| Vertex AI     | vertex/gemini-3.1-pro-preview  | Latest flagship with thinking      |
|               | vertex/gemini-3-flash-preview  | Fast, enterprise-grade             |
|               | vertex/gemini-2.5-pro          | Strong for complex tasks           |
|               | vertex/gemini-2.5-flash        | Cost-effective with IAM auth       |
|               | vertex/gemini-2.5-flash-lite   | Cheapest Vertex option             |
| Ollama        | ollama/llama3.1                | Meta's Llama 3.1 (8B/70B)          |
|               | ollama/qwen2.5                 | Alibaba's Qwen 2.5 (various sizes) |
|               | Any model                      | via `ollama pull`                  |

Model names use the provider/model format. New models from each provider are generally supported automatically — update your RCA_MODEL or select them in the UI.

Provider Setup

OpenRouter

The easiest way to get started: one API key gives you access to models from OpenAI, Anthropic, Google, Meta, and more.

OPENROUTER_API_KEY=sk-or-v1-...
LLM_PROVIDER_MODE=openrouter

OpenAI

OPENAI_API_KEY=sk-...
LLM_PROVIDER_MODE=direct

Anthropic

ANTHROPIC_API_KEY=sk-ant-...
LLM_PROVIDER_MODE=direct

Google AI (Gemini via API Key)

For using Gemini models with a Google AI Studio API key.

GOOGLE_AI_API_KEY=AIza...
LLM_PROVIDER_MODE=direct

Vertex AI (Gemini via Google Cloud)

For organizations using Google Cloud. Vertex AI provides enterprise-grade access to Gemini models with IAM-based authentication.

Requirements:

  • A Google Cloud project with Vertex AI API enabled
  • A service account with the Vertex AI User role

Setup:

# Required: Google Cloud project ID
VERTEX_AI_PROJECT=my-gcp-project

# Required: Service account credentials (JSON string)
VERTEX_AI_SERVICE_ACCOUNT_JSON={"type":"service_account","project_id":"...","private_key":"..."}

# Optional: Location (default: global)
VERTEX_AI_LOCATION=global

LLM_PROVIDER_MODE=direct

Authentication options:

  1. Service account JSON (recommended): Set VERTEX_AI_SERVICE_ACCOUNT_JSON to the full JSON contents of your service account key file.
  2. Application Default Credentials (ADC): If running on GCP (Cloud Run, GKE), ADC is automatic — just set VERTEX_AI_PROJECT.
  3. Credentials file path: Set GOOGLE_APPLICATION_CREDENTIALS to the path of your service account key file.
tip

The project ID is automatically extracted from the service account JSON if VERTEX_AI_PROJECT is not set.
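If you still need to create the service account, the provisioning steps might look like the following with the gcloud CLI. The project and account names here are placeholders; verify the role grant against your organization's policies.

```shell
# Placeholder names: replace my-gcp-project and aurora-rca with your own
gcloud services enable aiplatform.googleapis.com --project=my-gcp-project
gcloud iam service-accounts create aurora-rca --project=my-gcp-project
gcloud projects add-iam-policy-binding my-gcp-project \
  --member="serviceAccount:aurora-rca@my-gcp-project.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"
gcloud iam service-accounts keys create key.json \
  --iam-account=aurora-rca@my-gcp-project.iam.gserviceaccount.com
```

The contents of `key.json` are what goes into VERTEX_AI_SERVICE_ACCOUNT_JSON.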

Ollama (Local Models)

Run models locally on your own hardware with Ollama. No API key needed.

Setup:

  1. Install Ollama on your host machine
  2. Pull the models you want to use:
    ollama pull llama3.1
    ollama pull qwen2.5:32b
  3. Configure Aurora:
    OLLAMA_BASE_URL=http://host.docker.internal:11434
    LLM_PROVIDER_MODE=direct
Docker networking

host.docker.internal allows Docker containers to reach services running on the host machine. This works out of the box on macOS and Windows. On Linux, Aurora's Docker Compose files include the extra_hosts configuration needed for this to work.
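Aurora's Compose files already include this, but if you are writing your own Compose file on Linux, the relevant stanza typically looks like the following (the service name is a placeholder):

```yaml
services:
  aurora:
    extra_hosts:
      - "host.docker.internal:host-gateway"
```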

Recommended models for RCA:

| Model        | Size | Notes                             |
| ------------ | ---- | --------------------------------- |
| llama3.1:70b | 70B  | Best quality for complex RCA      |
| qwen2.5:32b  | 32B  | Good balance of quality and speed |
| llama3.2     | 3B   | Fast, but limited tool calling    |

RCA Model Configuration

By default, Aurora uses anthropic/claude-3-haiku for background Root Cause Analysis. You can change this to any supported provider/model.

# Format: provider/model-name
RCA_MODEL=anthropic/claude-3-haiku

Examples:

# Anthropic (default)
RCA_MODEL=anthropic/claude-3-haiku

# OpenAI
RCA_MODEL=openai/gpt-4o

# Google AI
RCA_MODEL=google/gemini-2.5-flash

# Vertex AI
RCA_MODEL=vertex/gemini-2.5-flash

# Ollama (local)
RCA_MODEL=ollama/llama3.1

When RCA_MODEL is not set, the default depends on RCA_OPTIMIZE_COSTS:

  • RCA_OPTIMIZE_COSTS=true (default): Uses anthropic/claude-3-haiku
  • RCA_OPTIMIZE_COSTS=false: Uses anthropic/claude-opus-4.5
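The fallback behavior above can be sketched in shell. This mirrors the documented defaults, not Aurora's internal code:

```shell
# Mirrors the documented defaults; not Aurora's actual implementation
RCA_OPTIMIZE_COSTS="${RCA_OPTIMIZE_COSTS:-true}"
if [ -z "${RCA_MODEL:-}" ]; then
  if [ "$RCA_OPTIMIZE_COSTS" = "true" ]; then
    RCA_MODEL="anthropic/claude-3-haiku"
  else
    RCA_MODEL="anthropic/claude-opus-4.5"
  fi
fi
echo "$RCA_MODEL"   # with neither variable set beforehand: anthropic/claude-3-haiku
```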

Cost Considerations

LLM costs depend on:

  • Tokens processed: Longer investigations use more tokens
  • Model choice: Larger models cost more per token
  • Frequency: More investigations = higher costs
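As a worked example of how these factors combine, here is a back-of-envelope estimate for a single investigation. The per-token prices are illustrative placeholders, not current rates for any provider:

```shell
# Illustrative prices only; check your provider's pricing page for real rates.
# 200k input tokens at $0.25 per 1M, plus 8k output tokens at $1.25 per 1M.
awk 'BEGIN { printf "%.4f\n", (200000/1e6)*0.25 + (8000/1e6)*1.25 }'
# prints: 0.0600
```

Multiply by investigations per day to project a monthly spend.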

Cost Optimization

  1. Set RCA_OPTIMIZE_COSTS=true to use cheaper models for background RCA
  2. Use OpenRouter for flexible, pay-per-token pricing
  3. Use Ollama for zero API costs (requires local GPU)

Troubleshooting

"Invalid API key"

  • Check key is correctly copied (no extra spaces)
  • Verify key is active in provider dashboard
  • Ensure correct environment variable name

"Rate limit exceeded"

  • Wait and retry
  • Consider upgrading your API tier
  • Reduce concurrent investigations

"Model not available"

  • Check provider status page
  • Try a different model
  • Ensure your API key has access to the model

Vertex AI: "DefaultCredentialsError"

  • Verify VERTEX_AI_PROJECT is set
  • Check that VERTEX_AI_SERVICE_ACCOUNT_JSON contains valid JSON
  • Ensure the service account has the Vertex AI User role

Ollama: "Provider not available"

  • Verify Ollama is running: curl http://localhost:11434/api/tags
  • Check that the model is pulled: ollama list
  • Ensure OLLAMA_BASE_URL is correct (use host.docker.internal in Docker)
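The first two checks above can be wrapped into a quick connectivity script (adjust the URL to match your OLLAMA_BASE_URL):

```shell
# Quick Ollama health check; falls back to the default local port
base="${OLLAMA_BASE_URL:-http://localhost:11434}"
if curl -fsS "$base/api/tags" >/dev/null 2>&1; then
  echo "ollama reachable at $base"
else
  echo "ollama unreachable at $base"
fi
```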