Stream silently produces empty response when model outputs non-standard delta.reasoning SSE field (Qwen3 thinking mode) #3145

Description

@philippart-s

When using a Qwen3-family model that streams thinking tokens via a non-standard delta.reasoning SSE field, Docker Agent's stream processor terminates in < 1ms and produces an empty response (tool_calls=0, content_length=0, stopped=true). The agent never executes any tool calls and returns no output.

Expected Behavior

The agent calls list_directory and returns the result.

Actual Behavior

The stream completes in < 1ms with no output:

level=DEBUG msg="OpenAI chat completion stream created successfully" model=Qwen3.5-397B-A17B
level=DEBUG msg="Processing stream" agent=root model=ovhcloud/Qwen3.5-397B-A17B
level=DEBUG msg="Stream processed" agent=root tool_calls=0 content_length=0 stopped=true
level=DEBUG msg="Skipping empty assistant message (no content and no tool calls)" agent=root
level=DEBUG msg="Conversation stopped" agent=root

Steps to Reproduce

Detailed steps to reproduce the bug:

  1. Create a reproducer.yml file:
providers:
  ovhcloud:
    api_type: openai_chatcompletions
    base_url: https://oai.endpoints.kepler.ai.cloud.ovh.net/v1
    token_key: OVH_AI_ENDPOINTS_ACCESS_TOKEN

models:
  ovh-qwen:
    provider: ovhcloud
    model: Qwen3.5-397B-A17B
    temperature: 0.6

agents:
  root:
    model: ovh-qwen
    instruction: |
      You are a helpful assistant.
    toolsets:
      - type: filesystem

ℹ️ This is reproducible without an API key: OVHcloud AI Endpoints offers free access to this model at 2 req/s.

  2. Run docker agent run reproducer.yml "List the files in the current directory"
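
To inspect the raw SSE stream outside Docker Agent, a request along the following lines can be used. This is a minimal sketch, assuming the endpoint exposes the usual OpenAI-compatible /chat/completions path under the base_url from reproducer.yml. build_stream_request is a hypothetical helper name, and the tool definition is a simplified stand-in, not Docker Agent's actual filesystem toolset schema.

```python
import json
import urllib.request

BASE_URL = "https://oai.endpoints.kepler.ai.cloud.ovh.net/v1"  # from reproducer.yml


def build_stream_request(prompt, token=None, base_url=BASE_URL,
                         model="Qwen3.5-397B-A17B"):
    """Build (but do not send) a streaming chat completion request.

    Hypothetical helper: the /chat/completions path and the tool schema
    below are assumptions based on the OpenAI-compatible API shape.
    """
    body = {
        "model": model,
        "stream": True,
        "temperature": 0.6,
        "messages": [{"role": "user", "content": prompt}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "list_directory",
                "description": "List files in a directory",
                "parameters": {
                    "type": "object",
                    "properties": {"path": {"type": "string"}},
                    "required": ["path"],
                },
            },
        }],
    }
    headers = {"Content-Type": "application/json"}
    if token:  # optional: the issue notes the model is reachable without a key
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the result to urllib.request.urlopen and printing each response line should show the raw "data: {...}" chunks, including the ones that carry only delta.reasoning.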

Environment

  • Docker Agent version: v1.79.0
  • OS: macOS Darwin 25.5.0
  • Provider: api_type: openai_chatcompletions (OVHcloud AI Endpoints)
  • Model: Qwen3.5-397B-A17B

Error Output

level=DEBUG msg="OpenAI chat completion stream created successfully" model=Qwen3.5-397B-A17B
level=DEBUG msg="Processing stream" agent=root model=ovhcloud/Qwen3.5-397B-A17B
level=DEBUG msg="Stream processed" agent=root tool_calls=0 content_length=0 stopped=true
level=DEBUG msg="Skipping empty assistant message (no content and no tool calls)" agent=root
level=DEBUG msg="Conversation stopped" agent=root

Screenshots

N/A

Additional Context

Qwen3-family models in thinking mode stream reasoning tokens via a non-standard delta.reasoning field before producing any content or tool calls:

{"choices":[{"index":0,"delta":{"role":"assistant","reasoning":"The user wants...","name":"assistant"}}]}

Confirmed via curl: the model does eventually emit standard delta.tool_calls chunks, but only after all delta.reasoning chunks have been sent. Docker Agent's stream processor appears to fail silently on the unknown delta.reasoning field, terminating the stream before any content or tool calls are produced.

The standard delta.tool_calls format produced by the model is correct:
{"choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"id":"chatcmpl-tool-xxx","type":"function","function":{"name":"list_directory","arguments":""}}]}}]}

Labels

area/agent: For work that has to do with the general agent loop/agentic features of the app
area/providers: For features/issues/fixes related to LLM providers (Bedrock, LiteLLM, Qwen, custom, etc.)
