llm protocol_error ai_generated true

json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 1024 (char 1023) when parsing streaming function call arguments

ID: llm/function-call-arguments-truncated-in-streaming

Fix rate: 85%
Confidence: 89%
Evidence count: 1
First seen: 2024-04-01

Version compatibility

Version            Status  Introduced  Deprecated  Notes
openai==1.30.0     active
anthropic==0.25.0  active
mistralai==0.1.0   active

Root cause analysis

When a function call is streamed, the API sends the arguments string in chunks. Chunk boundaries are arbitrary, so for a long arguments string a boundary can fall in the middle of a JSON token (for example, inside a string value or a key). Until the final chunk arrives, the accumulated arguments are an incomplete document, and parsing them incrementally with json.loads() raises JSONDecodeError.
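
The effect can be reproduced without any API call. The following is a minimal sketch, assuming a made-up arguments payload and a hypothetical helper `split_into_chunks` that cuts the string at fixed offsets to mimic arbitrary chunk boundaries; real chunk sizes vary.

```python
import json

# Hypothetical arguments payload, used only for illustration.
full_args = json.dumps({"city": "Berlin", "units": "metric", "days": 3})


def split_into_chunks(text, size):
    """Cut text into fixed-size pieces to mimic arbitrary chunk boundaries."""
    return [text[i:i + size] for i in range(0, len(text), size)]


chunks = split_into_chunks(full_args, 7)

buffer = ""
failed_prefixes = 0
for piece in chunks:
    buffer += piece
    try:
        json.loads(buffer)
    except json.JSONDecodeError:
        # The buffer is still an incomplete JSON document.
        failed_prefixes += 1

# Only the fully accumulated buffer parses.
parsed = json.loads(buffer)
print(failed_prefixes, "of", len(chunks), "accumulated prefixes failed to parse")
print(parsed)
```

Every prefix except the last one fails, because an object that has not yet received its closing brace is not valid JSON.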

generic

Official documentation

https://platform.openai.com/docs/guides/function-calling#streaming-function-calls

Solutions

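  1. Accumulate every chunk of the function call before attempting to parse. Wait until 'finish_reason' is 'stop' or 'function_call', then parse the complete arguments string. Example:
    
    import json

    # `stream` is the iterator returned by a streaming chat completion request
    full_args = ""
    for chunk in stream:
        if not chunk.choices:
            continue  # skip chunks that carry no choice (e.g. a usage-only chunk)
        choice = chunk.choices[0]
        delta = choice.delta
        if delta.function_call and delta.function_call.arguments:
            full_args += delta.function_call.arguments
        if choice.finish_reason:
            break  # the function call is complete
    # Parse only once, after every fragment has arrived
    parsed_args = json.loads(full_args)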
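  2. Use an 'incremental parsing' approach with a JSON repair library such as 'json-repair' to handle partial JSON. ('json5' accepts relaxed syntax such as trailing commas, but it still rejects a truncated document.) Treat intermediate results as provisional and parse the final buffer with json.loads() once the stream ends. Example:
    
    import json_repair  # provided by the 'json-repair' package

    partial_args = ""
    for chunk in stream:
        delta = chunk.choices[0].delta
        if delta.function_call and delta.function_call.arguments:
            partial_args += delta.function_call.arguments
            try:
                # Best-effort repair of the incomplete buffer
                parsed = json_repair.loads(partial_args)
                # Use the parsed result incrementally
            except Exception:
                pass  # not usable yet; wait for more chunks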
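  3. Set the API parameter 'stream_options' to {'include_usage': True} and treat the chunk whose 'usage' field is populated as the end-of-stream marker, then parse the complete string. This does not change how the arguments are chunked; it only signals that every fragment has arrived. That final chunk has an empty 'choices' list, so check for it before indexing choices[0].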
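
The accumulate-then-parse pattern from the solutions above can be sketched as a small helper. This is an illustration under stated assumptions, not the SDK's own API: `collect_function_args` and `fake_chunk` are hypothetical names, and the stream is simulated with `SimpleNamespace` objects that only mirror the attributes used here (`choices`, `delta.function_call.arguments`, `finish_reason`, `usage`).

```python
import json
from types import SimpleNamespace as NS


def collect_function_args(stream):
    """Accumulate argument fragments and parse them once the stream is exhausted."""
    parts = []
    for chunk in stream:
        if not chunk.choices:
            # A usage-only chunk carries no argument text.
            continue
        choice = chunk.choices[0]
        call = getattr(choice.delta, "function_call", None)
        if call is not None and call.arguments:
            parts.append(call.arguments)
    # Parse exactly once, on the complete string.
    return json.loads("".join(parts))


def fake_chunk(arguments=None, finish_reason=None):
    """Build a simulated stream chunk (hypothetical stand-in for an SDK object)."""
    call = NS(arguments=arguments) if arguments is not None else None
    choice = NS(delta=NS(function_call=call), finish_reason=finish_reason)
    return NS(choices=[choice], usage=None)


simulated_stream = [
    fake_chunk('{"city": "Ber'),
    fake_chunk('lin", "days"'),
    fake_chunk(': 3}'),
    fake_chunk(finish_reason="function_call"),
    NS(choices=[], usage=NS(total_tokens=42)),  # simulated usage-only final chunk
]

args = collect_function_args(simulated_stream)
print(args)
```

Collecting fragments in a list and joining them once avoids repeated string concatenation, and the empty-`choices` guard keeps the loop from failing on a usage-only chunk.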

Ineffective attempts

Common approaches that do not work:

  1. Using json.loads() on each individual chunk instead of accumulating (95% failure rate)

    Individual chunks are partial fragments, not complete JSON documents. json.loads() fails on nearly every fragment, and a fragment that happens to parse is still not the full arguments object.

  2. Increasing max_tokens to reduce chunking (80% failure rate)

    max_tokens controls the output length, not the chunk size. The API still sends chunks of arbitrary size regardless of max_tokens.

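  3. Setting stream_options={'include_usage': True} on its own (90% failure rate)

    This option only controls whether usage information is included in the stream. It has no effect on how function call arguments are chunked or parsed, so it helps only if the code also accumulates the fragments and parses them after the stream ends.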
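
To illustrate why per-chunk parsing (attempt 1) is misleading as well as fragile, here is a small sketch with made-up fragments. The fragment list and the `FAILED` marker are assumptions chosen for the demonstration.

```python
import json

# Hypothetical fragments of one arguments string.
chunks = ['{"days": ', '3', '}']

FAILED = "<decode error>"  # marker for fragments that do not parse

per_chunk = []
for piece in chunks:
    try:
        per_chunk.append(json.loads(piece))
    except json.JSONDecodeError:
        per_chunk.append(FAILED)

# Parsing the joined string yields the real arguments object.
combined = json.loads("".join(chunks))

print(per_chunk)
print(combined)
```

The middle fragment parses by itself as a bare number, which is valid JSON but not the arguments object, so a successful per-chunk parse cannot be trusted either.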