Tags: llm, type_error · ai_generated: true

ValidationError: 1 validation error for ToolCall value Input should be a valid integer [type=int_parsing, input_value='42.5', input_type=str]

ID: llm/tool-argument-type-coercion-failure

Fix Rate: 80%
Confidence: 86%
Evidence: 1
First Seen: 2024-03-12

Version Compatibility

| Version | Status | Introduced | Deprecated | Notes |
| --- | --- | --- | --- | --- |
| pydantic==2.5.0 | active | | | |
| pydantic==2.7.0 | active | | | |
| openai==1.16.0 | active | | | |
| gpt-4-turbo-2024-04-09 | active | | | |

Root Cause

When using tool calling with structured outputs, the LLM may return an argument as a string (e.g., '42.5') where the schema expects another type (e.g., an integer). Pydantic or JSON Schema validation then fails because the value cannot be coerced to the expected type: '42.5' has a fractional part, so it does not parse as an integer.
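The failure can be reproduced without an LLM in the loop. The following is a minimal sketch, assuming Pydantic v2 and a one-field `ToolCall` model matching the error message above; the dictionaries stand in for arguments parsed from a model response.

```python
from pydantic import BaseModel, ValidationError


class ToolCall(BaseModel):
    value: int


# A whole-number string is coerced in Pydantic's default (lax) mode.
print(ToolCall.model_validate({"value": "42"}).value)  # 42

# A string with a fractional part cannot be parsed as an integer.
try:
    ToolCall.model_validate({"value": "42.5"})
except ValidationError as exc:
    print(exc.errors()[0]["type"])  # int_parsing
```

So the problem is not that the argument is a string as such, but that the string does not denote a valid integer.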


Official Documentation

https://docs.pydantic.dev/latest/errors/validation_errors/

Workarounds

  1. 90% success: Use Pydantic's `BeforeValidator` to coerce string arguments before validation. Define `coerce_int(v)` to return `int(float(v))` when `v` is a string and `v` otherwise, then annotate the field as `value: Annotated[int, BeforeValidator(coerce_int)]`. Note that `int(float('42.5'))` truncates to 42.
  2. 80% success: In the tool schema, specify `type: 'number'` instead of `type: 'integer'` so that fractional values such as 42.5 are accepted as well as integers. On the Pydantic side this corresponds to a `float` field, which in the default lax mode also parses numeric strings such as '42.5'.
  3. 75% success: Add a retry loop with a system prompt instructing the LLM to produce valid JSON with correct types, for example: 'Ensure all numeric fields are numbers, not strings.'
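Workaround 1 written out as runnable code. This is a sketch assuming Pydantic v2 and Python 3.9+; `coerce_int` and `ToolCall` are the names used in the workaround text, and the type hints are added for illustration.

```python
from typing import Annotated, Any

from pydantic import BaseModel, BeforeValidator


def coerce_int(v: Any) -> Any:
    """Convert numeric strings to int; pass every other value through."""
    if isinstance(v, str):
        return int(float(v))  # truncates toward zero: '42.5' -> 42
    return v


class ToolCall(BaseModel):
    # BeforeValidator runs coerce_int before Pydantic's own int validation.
    value: Annotated[int, BeforeValidator(coerce_int)]


call = ToolCall.model_validate({"value": "42.5"})
print(call.value)  # 42
```

A non-numeric string such as 'abc' still fails: `float('abc')` raises `ValueError`, which Pydantic reports as a `ValidationError`. Truncation is a design choice here; if dropping the fractional part is not acceptable for the tool, rounding or rejecting the value may be the safer option.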

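The retry loop from workaround 3 might look like the sketch below, assuming Pydantic v2. `call_with_retry` and `ask_model` are hypothetical names: `ask_model` stands in for the real LLM call and is assumed to take a system prompt and return the raw JSON arguments string. The scripted replies at the end simulate a model that corrects itself on the second attempt.

```python
from typing import Callable, Optional

from pydantic import BaseModel, ValidationError

HINT = "Ensure all numeric fields are numbers, not strings."


class ToolCall(BaseModel):
    value: int


def call_with_retry(ask_model: Callable[[str], str], max_attempts: int = 3) -> ToolCall:
    """Validate the model's JSON arguments, re-asking with a type hint on failure."""
    prompt = "Return the tool arguments as JSON."
    last_error: Optional[ValidationError] = None
    for _ in range(max_attempts):
        raw = ask_model(prompt)
        try:
            return ToolCall.model_validate_json(raw)
        except ValidationError as exc:
            last_error = exc
            # Feed the hint and the validation error back to the model.
            prompt = f"Return the tool arguments as JSON. {HINT} Previous error: {exc}"
    raise RuntimeError(f"no valid tool call after {max_attempts} attempts") from last_error


# Scripted replies standing in for a model that fixes its output on retry.
replies = iter(['{"value": "42.5"}', '{"value": 42}'])
result = call_with_retry(lambda prompt: next(replies))
print(result.value)  # 42
```

Bounding the number of attempts keeps a persistently malformed response from looping forever; the final `RuntimeError` chains the last validation error for debugging.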

Dead Ends

Common approaches that don't work:

  1. 60% fail

    This may still fail if the string cannot be parsed as the target type (e.g., 'abc' for int) or produces unexpected values.

  2. 55% fail

    This can mask genuine errors like out-of-range values or non-numeric strings.
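The entries above do not name the approaches they describe, so the following only illustrates the failure modes they mention, on the assumption that they refer to blanket string-to-number coercion. `coerce_int_strict` is a hypothetical helper, not part of any library; it shows one way to coerce without silently hiding bad input.

```python
def coerce_int_strict(v):
    """Hypothetical helper: accept only strings that denote a whole number."""
    if not isinstance(v, str):
        return v
    number = float(v)  # raises ValueError for non-numeric input such as 'abc'
    if not number.is_integer():
        # Also rejects 'inf' and 'nan', for which is_integer() is False.
        raise ValueError(f"{v!r} is not a whole number; refusing to truncate")
    return int(number)


# Blanket coercion silently drops the fractional part.
print(int(float("42.5")))  # 42

# The strict variant surfaces both kinds of bad input instead of masking them.
for bad in ("abc", "42.5"):
    try:
        coerce_int_strict(bad)
    except ValueError as exc:
        print(f"rejected {bad!r}: {exc}")

print(coerce_int_strict("42"))  # 42
```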