llm protocol_error ai_generated true

json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 1024 (char 1023) when parsing streaming function call arguments

ID: llm/function-call-arguments-truncated-in-streaming

Fix rate: 85%

Confidence: 89%

Evidence count: 1

First seen: 2024-04-01

Version compatibility

Version	Status	Introduced	Deprecated	Notes
openai==1.30.0	active	—	—	—
anthropic==0.25.0	active	—	—	—
mistralai==0.1.0	active	—	—	—

Root cause analysis

When a function call is streamed, the API sends the arguments string in chunks. If the argument string is long, a chunk boundary can split a JSON token (e.g., a string value or a key), so the accumulated arguments are not valid JSON until the final chunk has arrived. Parsing the buffer incrementally, before it is complete, raises JSONDecodeError.
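To make the failure mode concrete, the minimal sketch below parses a growing buffer after each fragment. The payload and the chunk boundaries are made up for illustration; real chunk sizes are decided by the API and cannot be predicted.

```python
import json

# Hypothetical argument string and chunk boundaries, chosen for illustration.
full = '{"city": "Paris", "units": "metric"}'
fragments = [full[:10], full[10:25], full[25:]]


def parse_states(fragments):
    """Try to parse the accumulated buffer after every fragment."""
    buffer, results = "", []
    for fragment in fragments:
        buffer += fragment
        try:
            json.loads(buffer)
            results.append(True)    # buffer is a complete JSON document
        except json.JSONDecodeError:
            results.append(False)   # buffer is still only a prefix
    return results


states = parse_states(fragments)
```

Only the last entry of states is True: the first two buffers end inside a string value and after a key, so they are prefixes rather than complete JSON documents.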

generic

Official documentation

https://platform.openai.com/docs/guides/function-calling#streaming-function-calls

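Solutions

Accumulate all chunks of the function call before attempting to parse. Wait until 'finish_reason' is 'stop' or 'function_call', then parse the complete argument string. Example: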

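import json

full_args = ""
for chunk in stream:
    if not chunk.choices:
        continue  # skip chunks that carry no choices
    choice = chunk.choices[0]
    delta = choice.delta
    # Append each argument fragment; do not parse fragments individually.
    if delta.function_call and delta.function_call.arguments:
        full_args += delta.function_call.arguments
    if choice.finish_reason:
        break

# Parse once, after the stream has signaled completion.
parsed_args = json.loads(full_args)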
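The same accumulate-then-parse pattern can be exercised without a network call. The sketch below wraps it in a function and feeds it simulated chunks; fake_chunk and collect_function_args are hypothetical names introduced here, and the SimpleNamespace objects only mimic the attribute layout used in the example above, not a real SDK type.

```python
import json
from types import SimpleNamespace


def collect_function_args(stream):
    """Concatenate argument fragments, then parse once the stream signals completion."""
    parts = []
    for chunk in stream:
        if not chunk.choices:           # some chunks may carry no choices
            continue
        choice = chunk.choices[0]
        call = getattr(choice.delta, "function_call", None)
        if call is not None and call.arguments:
            parts.append(call.arguments)
        if choice.finish_reason is not None:
            break
    return json.loads("".join(parts))


def fake_chunk(arguments=None, finish_reason=None):
    """Hypothetical stand-in for a streamed chunk object (not a real SDK type)."""
    call = SimpleNamespace(arguments=arguments) if arguments is not None else None
    delta = SimpleNamespace(function_call=call)
    choice = SimpleNamespace(delta=delta, finish_reason=finish_reason)
    return SimpleNamespace(choices=[choice])


# Simulated stream: the fragments split a string value and a key/value pair.
stream = [
    fake_chunk('{"city": "Pa'),
    fake_chunk('ris", "units": '),
    fake_chunk('"metric"}'),
    fake_chunk(finish_reason="function_call"),
]
args = collect_function_args(stream)
```

One caveat worth keeping in mind: if finish_reason is 'length', the output was cut off by the token limit, so the accumulated string is likely truncated and json.loads() will still fail even though the stream has ended.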

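Use an 'incremental parsing' approach, pairing the accumulated buffer with a JSON repair library (e.g., 'json-repair' or 'json5') to handle partial JSON. Note that json5 only relaxes JSON syntax and still raises on truncated input, so in this example the except branch simply skips buffer states that do not parse yet. Example:

import json5
partial_args = ""
for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.function_call and delta.function_call.arguments:
        partial_args += delta.function_call.arguments
        try:
            parsed = json5.loads(partial_args)
            # use the parsed result incrementally
        except Exception:
            # buffer is not parseable yet; wait for more fragments
            pass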
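For readers who want to see what "handling partial JSON" can mean without a third-party dependency, here is a standard-library sketch that closes any open string, object, or array before parsing. It is not how 'json-repair' or 'json5' work internally; close_partial_json and try_parse_partial are hypothetical helpers written for this illustration.

```python
import json


def close_partial_json(buffer):
    """Append the closers a truncated JSON prefix is missing (best effort)."""
    stack = []            # closers still owed, innermost last
    in_string = False
    escaped = False
    for ch in buffer:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            stack.append("}")
        elif ch == "[":
            stack.append("]")
        elif ch in "}]" and stack:
            stack.pop()
    if escaped:
        buffer = buffer[:-1]  # drop a dangling backslash at the cut point
    return buffer + ('"' if in_string else "") + "".join(reversed(stack))


def try_parse_partial(buffer):
    """Return a parsed preview, or None when the prefix cannot be completed yet."""
    try:
        return json.loads(close_partial_json(buffer))
    except json.JSONDecodeError:
        return None


preview = try_parse_partial('{"city": "Pa')           # string cut mid-value
pending = try_parse_partial('{"city": "Paris", "un')  # cut inside a key
```

Here preview holds a dictionary with the truncated value "Pa", while pending is None because a key without a value cannot be completed. Any preview should be treated as provisional: a truncated string or number parses cleanly but may change once the next fragment arrives, so the final parse should still run on the complete buffer.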

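Set the API parameter 'stream_options' to {'include_usage': True} and use the 'usage' field to detect the end of the function call arguments, then parse the complete string. This option only adds an end-of-stream signal; it does not change how the arguments are chunked, so the fragments must still be accumulated first.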
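A minimal sketch of this end-of-stream detection follows, run against simulated chunks. It assumes that the usage data arrives on a final chunk whose 'choices' list is empty and that 'usage' is None on every other chunk; that layout, along with the helper names fake_chunk and parse_when_usage_arrives and the token count, is an assumption made for illustration and should be checked against the provider's documentation.

```python
import json
from types import SimpleNamespace


def parse_when_usage_arrives(stream):
    """Accumulate argument fragments; parse once a chunk with usage data is seen."""
    buffer = ""
    for chunk in stream:
        if chunk.choices:
            call = getattr(chunk.choices[0].delta, "function_call", None)
            if call is not None and call.arguments:
                buffer += call.arguments
        if getattr(chunk, "usage", None) is not None:
            # Assumed end-of-stream marker: parse the complete buffer now.
            return json.loads(buffer), chunk.usage
    raise RuntimeError("stream ended without a usage chunk")


def fake_chunk(arguments=None, usage=None):
    """Hypothetical stand-in for a streamed chunk; the usage chunk has no choices."""
    if usage is not None:
        return SimpleNamespace(choices=[], usage=usage)
    delta = SimpleNamespace(function_call=SimpleNamespace(arguments=arguments))
    choice = SimpleNamespace(delta=delta, finish_reason=None)
    return SimpleNamespace(choices=[choice], usage=None)


# Simulated stream; the token count is made up.
stream = [
    fake_chunk('{"query": "stream'),
    fake_chunk('ing"}'),
    fake_chunk(usage=SimpleNamespace(total_tokens=42)),
]
args, usage = parse_when_usage_arrives(stream)
```

The function raises if no usage chunk is ever seen, which makes a missing 'stream_options' setting fail loudly instead of returning a partially accumulated buffer.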

Ineffective attempts

Common but ineffective approaches:

Using json.loads() on each individual chunk instead of accumulating (95% failure)
Individual chunks are partial fragments, not valid JSON documents, so json.loads() fails on them.
Increasing max_tokens to reduce chunking (80% failure)
max_tokens limits the output length, not the chunk size. The API still sends chunks of arbitrary size regardless of max_tokens.
Setting stream_options={'include_usage': True} (90% failure)
This option only controls whether usage information is included in the stream; it has no effect on how function call arguments are chunked or parsed.