OpenAI
Use GPT-4o, GPT-5, GPT-5-mini, and other OpenAI models with docker-agent.
Setup
# Set your API key
export OPENAI_API_KEY="sk-..."
Configuration
Inline
agents:
  root:
    model: openai/gpt-5
Named Model
models:
  gpt:
    provider: openai
    model: gpt-5
    temperature: 0.7
    max_tokens: 4000
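A named model only takes effect once an agent points at it. The sketch below assumes an agent's model field accepts the key defined under models (here gpt) in place of the inline provider/model form; treat the pairing as illustrative rather than a shipped example.

```yaml
models:
  gpt:                 # key chosen by you; referenced by agents below
    provider: openai
    model: gpt-5
    temperature: 0.7
    max_tokens: 4000

agents:
  root:
    model: gpt         # assumption: refers to the named model above by its key
```

Defining the model once and referencing it by key keeps settings such as temperature and max_tokens in a single place when several agents share them.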
Available Models
| Model | Best For |
|---|---|
| gpt-5 | Most capable, complex reasoning |
| gpt-5-mini | Fast, cost-effective, good reasoning |
| gpt-4o | Multimodal, balanced performance |
| gpt-4o-mini | Cheapest, fast for simple tasks |
Find more model names at modelnames.ai or in the official OpenAI docs.
Thinking Budget
OpenAI reasoning models (o-series, gpt-5, gpt-5-mini) support extended thinking through the reasoning_effort API parameter. Set thinking_budget to control the effort level:
models:
  gpt-thinker:
    provider: openai
    model: gpt-5-mini
    thinking_budget: high # minimal | low | medium | high | xhigh
Effort levels:
| Level | Description |
|---|---|
| none | Don’t request extra reasoning (alias for 0); the API’s own default still applies. |
| minimal | Fastest; lightest reasoning pass. |
| low | Quick reasoning for straightforward tasks. |
| medium | Balanced default. |
| high | More thorough; recommended for complex tasks. |
| xhigh | Near-maximum effort; slower but most accurate. |
These are the only values OpenAI accepts — token counts, max, adaptive, and adaptive/<effort> are rejected with a configuration error at request time. Older models (o1, o3-mini) only accept low/medium/high.
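As a sketch of that restriction, the fragment below configures an older reasoning model with one of its three accepted levels. The key name legacy-reasoner is hypothetical, and the commented-out lines only restate the values the paragraph above says are rejected.

```yaml
models:
  legacy-reasoner:            # hypothetical name
    provider: openai
    model: o3-mini
    thinking_budget: medium   # o1 and o3-mini accept only low | medium | high
    # thinking_budget: xhigh     # not accepted by these older models
    # thinking_budget: 8192      # token counts are rejected for OpenAI
    # thinking_budget: adaptive  # rejected for OpenAI
```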
OpenAI reasoning models always produce hidden reasoning tokens that count against max_tokens — even with thinking_budget: none. docker-agent automatically raises the output-token floor for its internal low-effort calls so reasoning cannot starve visible text output.
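For your own model definitions, one way to account for this is to leave headroom in max_tokens when raising the effort level. The fragment below is a minimal sketch: the key name gpt-deep is hypothetical and 16000 is an illustrative figure, not a recommended value.

```yaml
models:
  gpt-deep:                 # hypothetical name
    provider: openai
    model: gpt-5
    thinking_budget: high
    max_tokens: 16000       # illustrative; the limit covers hidden reasoning tokens plus visible output
```

If responses come back truncated or empty at a high effort level, raising max_tokens or lowering thinking_budget are the two settings to try first.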
See the Thinking / Reasoning guide for a cross-provider overview.
Custom Endpoint
Use base_url to connect to proxies and other OpenAI-compatible services (see Custom Providers for full setup):
models:
  custom:
    provider: openai
    model: gpt-5-mini
    base_url: https://your-proxy.example.com/v1
WebSocket Transport
For OpenAI Responses API models (gpt-4.1+, o-series, gpt-5), you can use WebSocket streaming instead of the default SSE (Server-Sent Events):
models:
  fast-gpt:
    provider: openai
    model: gpt-4.1
    provider_opts:
      transport: websocket # Use WebSocket instead of SSE
Benefits
- ~40% faster for workflows with 20+ tool calls
- Persistent connection reduces per-turn overhead
- Server-side caching of connection state
- Automatic fallback to SSE if WebSocket fails
Requirements
- Only works with Responses API models: gpt-4.1+, o1, o3, o4, gpt-5
- NOT compatible with the --models-gateway flag (automatically falls back to SSE when a gateway is configured)
- Requires OPENAI_API_KEY environment variable
Example
See examples/websocket_transport.yaml for a complete example.