llm
data_error
ai_generated
partial
ValidationError: 1 validation error for ResponseModel color Input should be 'red', 'green', or 'blue' [type=enum, input_value='purple', input_type=str]
ID: llm/llm-structured-output-enum-violation-streaming
Fix Rate: 75%
Confidence: 82%
Evidence: 1
First Seen: 2024-04-05
Version Compatibility
| Version | Status | Introduced | Deprecated | Notes |
|---|---|---|---|---|
| openai 1.12.0 | active | — | — | — |
| openai 1.13.0 | active | — | — | — |
| pydantic 2.5.0 | active | — | — | — |
Root Cause
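When structured output is combined with streaming, the LLM can generate enum values outside the allowed set, because constraints are not fully enforced while tokens are generated incrementally. The invalid value (here 'purple') only surfaces when the completed response is validated against the model.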
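To make the failure mode concrete, the sketch below buffers a simulated stream and validates the enum field only once the JSON is complete. It is a minimal illustration using the standard library in place of pydantic. The chunk contents are hard-coded stand-ins, not real API output, and `ALLOWED_COLORS`, `collect_stream`, and `validate_color` are hypothetical names introduced here.

```python
import json

# Allowed values taken from the error message in this entry.
ALLOWED_COLORS = {"red", "green", "blue"}


def collect_stream(chunks):
    """Join streamed text fragments and parse them as one JSON document."""
    return json.loads("".join(chunks))


def validate_color(payload):
    """Return payload['color'], or raise ValueError if it is not allowed."""
    color = payload.get("color")
    if color not in ALLOWED_COLORS:
        raise ValueError(
            f"color must be one of {sorted(ALLOWED_COLORS)}, got {color!r}"
        )
    return color


# Hypothetical fragments standing in for a streamed response.
simulated_chunks = ['{"col', 'or": "pur', 'ple"}']
payload = collect_stream(simulated_chunks)

try:
    validate_color(payload)
except ValueError as exc:
    print(f"rejected: {exc}")
```

No individual fragment contains the full value, so the check can only run after the last chunk arrives.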
Official Documentation
https://platform.openai.com/docs/guides/structured-outputs
Workarounds
-
Post-process the output to replace invalid values with a valid enum value (here a fixed fallback): valid_colors = {'red','green','blue'}; if output.color not in valid_colors: output.color = 'blue' # fallback
85% success
-
Switch to non-streaming mode for structured outputs: response = client.chat.completions.create(model='gpt-4', response_format={'type':'json_object'}, stream=False)
95% success
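The post-processing workaround can be sketched as follows. This version is an assumption layered on top of the original one-liner: it first tries the closest valid spelling via the standard library's `difflib.get_close_matches`, and only then falls back to 'blue'. The names `coerce_color`, `VALID_COLORS`, and `FALLBACK_COLOR` and the 0.6 similarity cutoff are illustrative choices, not part of any library API.

```python
from difflib import get_close_matches

VALID_COLORS = ("red", "green", "blue")
FALLBACK_COLOR = "blue"  # same fallback value as the workaround above


def coerce_color(value, valid=VALID_COLORS, fallback=FALLBACK_COLOR):
    """Return a valid color: the input itself, its closest match, or the fallback."""
    normalized = str(value).strip().lower()
    if normalized in valid:
        return normalized
    # get_close_matches returns up to n candidates scoring at least `cutoff`,
    # best match first; the list is empty when nothing is similar enough.
    matches = get_close_matches(normalized, valid, n=1, cutoff=0.6)
    return matches[0] if matches else fallback


for raw in ("Green", "gren", "purple"):
    print(raw, "->", coerce_color(raw))
```

Silently coercing values hides model errors, so it may be worth logging each substitution before relying on this in production.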
Dead Ends
Common approaches that don't work:
-
Setting temperature to 0 to reduce randomness
80% fail
Enum violations stem from gaps in token-level constraint enforcement during decoding, not from sampling randomness.
-
Increasing max_tokens hoping for complete output
90% fail
More tokens don't fix constraint enforcement; the model still generates invalid values.