These docs track the main branch and may describe unreleased features. The stable documentation lives at docs.docker.com.

OVHcloud

Use OVHcloud AI Endpoints models with docker-agent.

Overview

OVHcloud AI Endpoints serves open-weight models through an OpenAI-compatible API, hosted in the EU. docker-agent includes built-in support for OVHcloud as an alias provider.

Setup

  1. Create an access token from the OVHcloud AI Endpoints portal.
  2. Set the environment variable:

    export OVH_AI_ENDPOINTS_ACCESS_TOKEN=your-access-token
    

Usage

Inline Syntax

agents:
  root:
    model: ovhcloud/Qwen3.5-397B-A17B
    description: Assistant using OVHcloud
    instruction: You are a helpful assistant.

Named Model

models:
  ovhcloud_model:
    provider: ovhcloud
    model: Qwen3.5-397B-A17B
    temperature: 0.7
    max_tokens: 8192

agents:
  root:
    model: ovhcloud_model
    description: Assistant using OVHcloud
    instruction: You are a helpful assistant.

Available Models

OVHcloud hosts a rotating catalogue of open-weight models. Check the AI Endpoints catalogue for current model IDs, context limits, and free-tier availability.

Model Description
Qwen3.5-397B-A17B Large Qwen3.5 MoE — strong general, coding, and reasoning
Qwen3-32B Mid-size Qwen3 — fast, tool-calling, reasoning
Qwen3.6-27B Compact Qwen3.6 — fast and efficient
Qwen3.5-9B Small Qwen3.5 — lightweight, free-tier friendly
Qwen3-Coder-30B-A3B-Instruct Qwen3 Coder MoE — optimised for code generation
Meta-Llama-3_3-70B-Instruct Llama 3.3 70B — reliable general-purpose chat
Mistral-Small-3.2-24B-Instruct-2506 Compact, fast, tool-calling

Model IDs are case-sensitive and must be passed exactly as the catalogue lists them.

How It Works

OVHcloud is implemented as a built-in alias in docker-agent:

docker-agent automatically coalesces consecutive system messages into one for OVHcloud, because some OVHcloud models return an empty stream when a request carries more than one system message.

Free tier

OVHcloud offers rate-limited free access to several models. Under heavy rate-limiting the endpoint may return an empty response; docker-agent surfaces this as a warning rather than failing. For sustained use, an access token with a paid plan avoids the free-tier request-rate cap.

Example: Code Assistant

agents:
  coder:
    model: ovhcloud/Qwen3.5-397B-A17B
    description: Code assistant using Qwen3.5
    instruction: |
      You are an expert programmer.
      Write clean, well-documented code and follow language best practices.
    toolsets:
      - type: filesystem
      - type: shell
      - type: think