Model Routing
Route requests to different models based on the content of user messages.
Overview
Model routing lets you define a “router” model that automatically selects the best underlying model based on the user’s message. This is useful for cost optimization, specialized handling, or load balancing across models.
cagent uses NLP-based text similarity (via Bleve full-text search) to match user messages against example phrases you define. The route with the best-matching examples wins, and that model handles the request.
Configuration
Add routing rules to any model definition. The model’s provider/model fields become the fallback when no route matches:
models:
smart_router:
# Fallback model when no routing rule matches
provider: openai
model: gpt-4o-mini
# Routing rules
routing:
- model: anthropic/claude-sonnet-4-0
examples:
- "Write a detailed technical document"
- "Help me architect this system"
- "Review this code for security issues"
- "Explain this complex algorithm"
- model: openai/gpt-4o
examples:
- "Generate some creative ideas"
- "Write a story about"
- "Help me brainstorm"
- "Come up with names for"
- model: openai/gpt-4o-mini
examples:
- "What time is it"
- "Convert this to JSON"
- "Simple math calculation"
- "Translate this word"
agents:
root:
model: smart_router
description: Assistant with intelligent model routing
instruction: You are a helpful assistant.
Routing Rules
Each routing rule has:
| Field | Type | Required | Description |
|---|---|---|---|
model |
string | ✓ | Target model (inline format or reference to models section) |
examples |
array | ✓ | Example phrases that should route to this model |
Matching Behavior
The router:
- Extracts the last user message from the conversation
- Searches all examples using full-text search
- Aggregates match scores by route (best score per route wins)
- Selects the route with the highest overall score
- Falls back to the base model if no good match is found
Use Cases
Cost Optimization
Route simple queries to cheaper models:
models:
cost_optimizer:
provider: openai
model: gpt-4o-mini # Cheap fallback
routing:
- model: anthropic/claude-sonnet-4-0
examples:
- "Complex analysis"
- "Detailed research"
- "Multi-step reasoning"
Specialized Models
Route coding tasks to code-specialized models:
models:
task_router:
provider: openai
model: gpt-4o # General fallback
routing:
- model: anthropic/claude-sonnet-4-0
examples:
- "Write code"
- "Debug this function"
- "Review my implementation"
- "Fix this bug"
- model: openai/gpt-4o
examples:
- "Write a blog post"
- "Help me with writing"
- "Summarize this document"
Load Balancing
Distribute load across equivalent models from different providers:
models:
load_balancer:
provider: openai
model: gpt-4o
routing:
- model: anthropic/claude-sonnet-4-0
examples:
- "First request pattern"
- "Another request type"
- model: google/gemini-2.5-flash
examples:
- "Different request pattern"
- "Alternative query style"
Debugging
Enable debug logging to see routing decisions:
cagent run config.yaml --debug
Look for log entries like:
"Rule-based router selected model" router=smart_router selected_model=anthropic/claude-sonnet-4-0
"Route matched" model=anthropic/claude-sonnet-4-0 score=2.45