llm runtime_error ai_generated true

KeyError: 'content' — streaming response chunk missing 'content' field in assistant message delta

ID: llm/streaming-assistant-content-empty

Also available as: JSON · Markdown · 中文
90%Fix Rate
87%Confidence
1Evidence
2024-03-22First Seen

Version Compatibility

VersionStatusIntroducedDeprecatedNotes
openai>=1.0.0 active
gpt-4-turbo-2024-04-09 active
gpt-4o-2024-05-13 active

Root Cause

When streaming responses that include tool calls, some chunks may contain only tool_calls without content; code that expects 'content' in every delta raises KeyError.

generic

中文

当流式响应包含工具调用时,某些块可能仅包含 tool_calls 而没有 content;期望每个 delta 中都有 'content' 的代码会引发 KeyError。

Official Documentation

https://platform.openai.com/docs/guides/streaming-responses/streaming-tool-calls

Workarounds

  1. 95% success Check for both 'content' and 'tool_calls' in each delta: if 'tool_calls' is present, handle them separately; if 'content' is absent, initialize an empty string.
    Check for both 'content' and 'tool_calls' in each delta: if 'tool_calls' is present, handle them separately; if 'content' is absent, initialize an empty string.
  2. 90% success Use the OpenAI SDK's built-in streaming helper that automatically parses deltas into complete messages, such as client.beta.chat.completions.stream()
    Use the OpenAI SDK's built-in streaming helper that automatically parses deltas into complete messages, such as client.beta.chat.completions.stream()
  3. 85% success Implement a robust delta accumulator that tracks both content and tool_calls across chunks, assembling them into a complete message before processing.
    Implement a robust delta accumulator that tracks both content and tool_calls across chunks, assembling them into a complete message before processing.

中文步骤

  1. 在每个 delta 中检查 'content' 和 'tool_calls':如果存在 'tool_calls',则单独处理;如果缺少 'content',则初始化为空字符串。
  2. 使用 OpenAI SDK 内置的流式处理辅助函数,自动将 delta 解析为完整消息,例如 client.beta.chat.completions.stream()
  3. 实现一个健壮的 delta 累加器,跨块跟踪 content 和 tool_calls,在处理前组装成完整消息。

Dead Ends

Common approaches that don't work:

  1. 60% fail

    This masks the error but does not handle tool_calls; the application may miss tool call data entirely

  2. 80% fail

    Skipping chunks may lose tool call data or other important deltas, corrupting the response

  3. 40% fail

    Disabling streaming eliminates the error but increases latency and defeats the purpose of streaming for user experience