llm data_error ai_generated true

InvalidRequestError: function_call arguments must be valid JSON — streaming mode detected malformed JSON

ID: llm/function-call-json-schema-violation-in-streaming

Fix Rate: 85%
Confidence: 88%
Evidence: 1
First Seen: 2024-02-10

Version Compatibility

| Version | Status | Introduced | Deprecated | Notes |
|---|---|---|---|---|
| openai==1.14.0 | active | | | |
| anthropic==0.28.0 | active | | | |
| gpt-4-0613 | active | | | |
| claude-3-sonnet-20240229 | active | | | |

Root Cause

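When a function call is streamed, its arguments arrive as string fragments spread across chunks, so intermediate chunks contain incomplete JSON, and the LLM may also emit malformed JSON overall. If those arguments are validated strictly before they are complete and well-formed, the API rejects the request with this error.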
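
To make the failure mode concrete, here is a minimal sketch using only the standard library. The fragment boundaries are invented for illustration and are not taken from a real API response. It shows that neither a single fragment nor a partial concatenation parses, while the complete string does.

```python
import json

# Hypothetical fragments, split the way a streamed arguments string might arrive.
fragments = ['{"loc', 'ation": "Par', 'is", "unit"', ': "celsius"}']


def is_valid_json(text: str) -> bool:
    """Return True if text parses as a complete JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# No single fragment is valid JSON on its own.
per_fragment = [is_valid_json(f) for f in fragments]

# Partial concatenations fail too; only the full string parses.
prefixes = ["".join(fragments[: i + 1]) for i in range(len(fragments))]
per_prefix = [is_valid_json(p) for p in prefixes]

print(per_fragment)  # [False, False, False, False]
print(per_prefix)    # [False, False, False, True]
```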

Official Documentation

https://platform.openai.com/docs/guides/function-calling

Workarounds

  1. 90% success: Accumulate the streamed fragments and parse the JSON only after the final chunk has arrived. Append each non-empty `chunk.choices[0].delta.function_call.arguments` to a list (for example `function_call_chunks`) while iterating over `stream`, then call `args = json.loads(''.join(function_call_chunks))` once the loop ends.
  2. 80% success: Use a JSON repair library such as `json-repair` to fix malformed JSON in the accumulated arguments string before validation.
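
The two workarounds above can be combined roughly as follows. This is a sketch under stated assumptions, not a definitive implementation: the chunk shape mirrors the `chunk.choices[0].delta.function_call.arguments` access path quoted above, the helper names `collect_function_call_arguments`, `parse_arguments`, and `fake_chunk` are hypothetical, and the repair step assumes the third-party `json-repair` package exposes `repair_json` (it is skipped if the package is not installed). The fake stream stands in for a real streaming response so the example runs offline.

```python
import json
from types import SimpleNamespace
from typing import Any, Iterable

try:
    # Optional third-party dependency (assumption: `pip install json-repair`).
    from json_repair import repair_json
except ImportError:
    repair_json = None


def collect_function_call_arguments(stream: Iterable[Any]) -> str:
    """Join the argument fragments from every chunk into one string."""
    parts = []
    for chunk in stream:
        if not chunk.choices:  # some chunks may carry no choices
            continue
        function_call = getattr(chunk.choices[0].delta, "function_call", None)
        # Skip chunks without a function call or with empty/None arguments.
        if function_call is not None and function_call.arguments:
            parts.append(function_call.arguments)
    return "".join(parts)


def parse_arguments(raw: str) -> Any:
    """Parse the complete string; attempt a repair only if parsing fails."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        if repair_json is None:
            raise  # no repair library available, surface the original error
        return json.loads(repair_json(raw))


def fake_chunk(arguments):
    """Build a stand-in chunk with the assumed attribute layout."""
    function_call = None if arguments is None else SimpleNamespace(arguments=arguments)
    delta = SimpleNamespace(function_call=function_call)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


stream = [
    fake_chunk(None),              # e.g. a chunk carrying no function call
    SimpleNamespace(choices=[]),   # a chunk with no choices
    fake_chunk('{"location": "Par'),
    fake_chunk('is", "unit": '),
    fake_chunk('"celsius"}'),
]

raw = collect_function_call_arguments(stream)
args = parse_arguments(raw)
print(args)  # {'location': 'Paris', 'unit': 'celsius'}
```

With a real client, the iterator returned by the streaming call would take the place of `stream`. Parsing happens exactly once, after the loop has finished, so incomplete intermediate fragments never reach `json.loads`.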

Dead Ends

Common approaches that don't work:

  1. 70% fail: Disabling streaming entirely (`stream=False`) avoids the issue but defeats the purpose of real-time interaction.
  2. 90% fail: Manually escaping JSON characters in the function schema doesn't help because the error is in the LLM output, not the schema.
  3. 80% fail: Increasing `temperature` or `top_p` to force more varied output doesn't fix JSON structure issues.