AWS Bedrock

Access Claude, Nova, Llama, and more through AWS infrastructure with enterprise-grade security and compliance.

Prerequisites

An AWS account with access to Amazon Bedrock, and model access granted for the models you want to use (requested through the Bedrock console in your target region).

Configuration

models:
  bedrock-claude:
    provider: amazon-bedrock
    model: global.anthropic.claude-sonnet-4-5-20250929-v1:0
    max_tokens: 64000
    provider_opts:
      region: us-east-1
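With the model defined, an agent file that references it can be run directly. A sketch, assuming the cagent CLI and a hypothetical agent file name:

```shell
# Run an agent whose configuration references the bedrock-claude model
# (agent.yaml is a placeholder file name)
cagent run agent.yaml
```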

Authentication

Option 1: Bedrock API Key (Simplest)

export AWS_BEARER_TOKEN_BEDROCK="your-key"

models:
  bedrock:
    provider: amazon-bedrock
    model: global.anthropic.claude-sonnet-4-5-20250929-v1:0
    token_key: AWS_BEARER_TOKEN_BEDROCK # env var name
    provider_opts:
      region: us-east-1

Option 2: AWS Credentials (Default)

Uses the standard AWS SDK credential chain: env vars → shared credentials → config → IAM roles.
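For example, the first link in that chain can be satisfied with environment variables (placeholder values shown):

```shell
# Static credentials via environment variables — checked before
# shared credential files, config profiles, and IAM roles
export AWS_ACCESS_KEY_ID="AKIA..."        # placeholder
export AWS_SECRET_ACCESS_KEY="..."        # placeholder
export AWS_SESSION_TOKEN="..."            # only needed for temporary credentials
```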

models:
  bedrock:
    provider: amazon-bedrock
    model: global.anthropic.claude-sonnet-4-5-20250929-v1:0
    provider_opts:
      profile: my-aws-profile
      region: us-east-1

With IAM Role Assumption

models:
  bedrock:
    provider: amazon-bedrock
    model: anthropic.claude-3-sonnet-20240229-v1:0
    provider_opts:
      role_arn: "arn:aws:iam::123456789012:role/BedrockAccessRole"
      external_id: "my-external-id"
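For role assumption to succeed, the target role's trust policy must allow your caller to assume it and, when external_id is set, require a matching sts:ExternalId condition. A sketch of such a trust policy (the account ID and principal below are placeholders):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::123456789012:root" },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": { "sts:ExternalId": "my-external-id" }
      }
    }
  ]
}
```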

Provider Options

| Option | Type | Default | Description |
|---|---|---|---|
| region | string | us-east-1 | AWS region |
| profile | string | – | AWS profile name |
| role_arn | string | – | IAM role ARN for assume role |
| role_session_name | string | cagent-bedrock-session | Session name for assumed role |
| external_id | string | – | External ID for role assumption |
| endpoint_url | string | – | Custom endpoint (VPC/testing) |
| interleaved_thinking | bool | true | Reasoning during tool calls (Claude) |
| disable_prompt_caching | bool | false | Disable automatic prompt caching |
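For example, endpoint_url can point Bedrock traffic at a VPC interface endpoint instead of the public API (the endpoint DNS name below is a placeholder):

```yaml
models:
  bedrock:
    provider: amazon-bedrock
    model: global.anthropic.claude-sonnet-4-5-20250929-v1:0
    provider_opts:
      region: us-east-1
      # Placeholder VPC endpoint DNS name — use your own interface endpoint
      endpoint_url: "https://vpce-0abc123-example.bedrock-runtime.us-east-1.vpce.amazonaws.com"
```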

Inference Profiles

Use inference profile prefixes for optimal routing:

| Prefix | Routes To |
|---|---|
| global. | All commercial AWS regions (recommended) |
| us. | US regions only |
| eu. | EU regions only (GDPR compliance) |
💡 Tip: Use the global. prefix on model IDs for automatic cross-region routing, or the eu. prefix for GDPR compliance.
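For example, a configuration pinned to EU regions only (assuming the model is available through the eu. inference profile):

```yaml
models:
  bedrock-eu:
    provider: amazon-bedrock
    model: eu.anthropic.claude-sonnet-4-5-20250929-v1:0
    provider_opts:
      region: eu-west-1
```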

Prompt Caching

Automatically enabled for supported models to reduce latency and costs. System prompts, tool definitions, and recent messages are cached with a 5-minute TTL.

# Disable if needed
provider_opts:
  disable_prompt_caching: true