Model Routing

Route requests to different models based on the content of user messages.

Overview

Model routing lets you define a “router” model that automatically selects the best underlying model based on the user’s message. This is useful for cost optimization, specialized handling, or load balancing across models.

ℹ️ How It Works

cagent uses NLP-based text similarity (via Bleve full-text search) to match user messages against example phrases you define. The route with the best-matching examples wins, and that model handles the request.

Configuration

Add routing rules to any model definition. The model’s provider/model fields become the fallback when no route matches:

models:
  smart_router:
    # Fallback model when no routing rule matches
    provider: openai
    model: gpt-4o-mini

    # Routing rules
    routing:
      - model: anthropic/claude-sonnet-4-0
        examples:
          - "Write a detailed technical document"
          - "Help me architect this system"
          - "Review this code for security issues"
          - "Explain this complex algorithm"

      - model: openai/gpt-4o
        examples:
          - "Generate some creative ideas"
          - "Write a story about"
          - "Help me brainstorm"
          - "Come up with names for"

      - model: openai/gpt-4o-mini
        examples:
          - "What time is it"
          - "Convert this to JSON"
          - "Simple math calculation"
          - "Translate this word"

agents:
  root:
    model: smart_router
    description: Assistant with intelligent model routing
    instruction: You are a helpful assistant.

Routing Rules

Each routing rule has:

Field	Type	Required	Description
`model`	string	✓	Target model (inline format or reference to `models` section)
`examples`	array	✓	Example phrases that should route to this model
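
Since both fields are required, a quick pre-flight check can catch incomplete rules before a config is run. The following is a minimal sketch, not part of cagent: `validate_rule` is a hypothetical helper that operates on a rule already parsed into a dictionary, and treating an empty `examples` list as an error is this sketch's own assumption.

```python
def validate_rule(rule: dict) -> list[str]:
    """Return a list of problems found in one routing rule (empty if none).

    Hypothetical helper: it mirrors the table above, it is not cagent's validator.
    """
    problems = []

    # `model` is required and must be a non-empty string.
    model = rule.get("model")
    if not isinstance(model, str) or not model.strip():
        problems.append("model: required, must be a non-empty string")

    # `examples` is required; an empty list is rejected here by assumption.
    examples = rule.get("examples")
    if not isinstance(examples, list) or not examples:
        problems.append("examples: required, must be a non-empty list")
    elif not all(isinstance(e, str) and e.strip() for e in examples):
        problems.append("examples: every entry must be a non-empty string")

    return problems
```

Calling `validate_rule({"model": "openai/gpt-4o"})` returns a single problem, the one for the missing `examples` field.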

Matching Behavior

The router:

1. Extracts the last user message from the conversation
2. Searches all examples using full-text search
3. Aggregates match scores by route (best score per route wins)
4. Selects the route with the highest overall score
5. Falls back to the base model if no good match is found
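
The steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not cagent's implementation: the `score` function is a simple word-overlap stand-in for Bleve's full-text scoring, and `ROUTES`, `FALLBACK`, `select_model`, and the `min_score` threshold are hypothetical names and values chosen for the example.

```python
import re

# Hypothetical in-memory form of a `routing` list and its fallback model.
ROUTES = [
    {"model": "anthropic/claude-sonnet-4-0",
     "examples": ["Review this code for security issues",
                  "Explain this complex algorithm"]},
    {"model": "openai/gpt-4o",
     "examples": ["Write a story about", "Help me brainstorm"]},
]
FALLBACK = "openai/gpt-4o-mini"


def tokens(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def score(message: str, example: str) -> float:
    """Stand-in scorer (assumption): share of the example's words present in the message."""
    example_words = tokens(example)
    if not example_words:
        return 0.0
    return len(example_words & tokens(message)) / len(example_words)


def select_model(conversation, routes=ROUTES, fallback=FALLBACK, min_score=0.5) -> str:
    # Step 1: take the last user message; earlier turns are ignored.
    last_user = next(
        (m["content"] for m in reversed(conversation) if m["role"] == "user"),
        None,
    )
    if last_user is None:
        return fallback

    # Steps 2-3: score every example, keeping the best score per route.
    best = {}
    for route in routes:
        best[route["model"]] = max(
            (score(last_user, ex) for ex in route["examples"]), default=0.0
        )

    # Step 4: pick the route with the highest score.
    model, top = max(best.items(), key=lambda kv: kv[1], default=(fallback, 0.0))

    # Step 5: fall back when nothing matches well enough (threshold is an assumption).
    return model if top >= min_score else fallback
```

With these sample routes, a conversation ending in "Please review this code for security issues" selects `anthropic/claude-sonnet-4-0`, while one ending in "What time is it" shares no words with any example and falls back to `openai/gpt-4o-mini`.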

💡 Writing Good Examples

- Use diverse phrasing that captures the intent
- Include keywords users actually use
- Add 5-10 examples per route for best results
- Examples don't need to be exact matches: the router ranks examples by text similarity rather than requiring identical wording

Use Cases

Cost Optimization

Route simple queries to cheaper models:

models:
  cost_optimizer:
    provider: openai
    model: gpt-4o-mini # Cheap fallback
    routing:
      - model: anthropic/claude-sonnet-4-0
        examples:
          - "Complex analysis"
          - "Detailed research"
          - "Multi-step reasoning"

Specialized Models

Route coding tasks to code-specialized models:

models:
  task_router:
    provider: openai
    model: gpt-4o # General fallback
    routing:
      - model: anthropic/claude-sonnet-4-0
        examples:
          - "Write code"
          - "Debug this function"
          - "Review my implementation"
          - "Fix this bug"
      - model: openai/gpt-4o
        examples:
          - "Write a blog post"
          - "Help me with writing"
          - "Summarize this document"

Load Balancing

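Distribute requests across equivalent models from different providers. Because routing is content-based, the split follows the request patterns you define rather than current load: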

models:
  load_balancer:
    provider: openai
    model: gpt-4o
    routing:
      - model: anthropic/claude-sonnet-4-0
        examples:
          - "First request pattern"
          - "Another request type"
      - model: google/gemini-2.5-flash
        examples:
          - "Different request pattern"
          - "Alternative query style"

Debugging

Enable debug logging to see routing decisions:

cagent run config.yaml --debug

Look for log entries like:

"Rule-based router selected model" router=smart_router selected_model=anthropic/claude-sonnet-4-0
"Route matched" model=anthropic/claude-sonnet-4-0 score=2.45

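When a session produces many routing entries, it can help to pull the fields out programmatically. The sketch below assumes entries shaped like the two samples above, a quoted message followed by `key=value` pairs; real log lines may carry extra prefixes such as timestamps or levels, and `parse_log_line` is a hypothetical helper, not a cagent utility.

```python
import re

# The quoted message, e.g. "Route matched".
MESSAGE = re.compile(r'"([^"]*)"')
# key=value pairs such as router=smart_router or score=2.45.
PAIR = re.compile(r"(\w+)=(\S+)")


def parse_log_line(line: str) -> dict:
    """Split a routing log entry into its message and key=value fields."""
    match = MESSAGE.search(line)
    # Only look for pairs after the quoted message, if there is one.
    rest = line[match.end():] if match else line
    fields = dict(PAIR.findall(rest))
    fields["message"] = match.group(1) if match else ""
    return fields


entry = parse_log_line('"Route matched" model=anthropic/claude-sonnet-4-0 score=2.45')
```

All values come back as strings, so convert `score` with `float(entry["score"])` before comparing routes.
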
⚠️ Limitations

- Routing only considers the last user message, not full conversation context
- Very short messages may not match well, so consider your fallback carefully
- Each routed model creates a separate provider connection
