{ "id": "llm/truncated-response-in-streaming-mode", "signature": "Error: Incomplete stream response - expected more data but connection closed unexpectedly.", "signature_zh": "Error: 不完整的流式响应 - 预期更多数据但连接意外关闭。", "regex": "Incomplete stream response|connection closed unexpectedly|stream.*truncated", "domain": "llm", "category": "network_error", "subcategory": null, "root_cause": "The network connection to the LLM API was interrupted mid-stream by a timeout, server restart, or proxy issue, leaving the response truncated.", "root_cause_type": "generic", "root_cause_zh": "由于超时、服务器重启或代理问题，LLM API的网络连接在流传输过程中中断，导致响应被截断。", "versions": [ { "version": "openai>=1.0.0", "introduced": null, "deprecated": null, "removed": null, "behavior_change": null, "status": "active" }, { "version": "httpx>=0.24.0", "introduced": null, "deprecated": null, "removed": null, "behavior_change": null, "status": "active" } ], "os_specific": {}, "dead_ends": [ { "action": "Increase the client timeout", "why_fails": "The error can be caused by server-side issues or network instability, not just a client timeout. 
Long timeouts can also mask other problems.", "fail_rate": 0.4, "condition": "", "sources": [] }, { "action": "Switch to non-streaming calls", "why_fails": "Non-streaming calls can also fail with connection errors, and they do not address the root cause of network instability.", "fail_rate": 0.3, "condition": "", "sources": [] } ], "workarounds": [ { "action": "Implement retry logic with exponential backoff for stream interruptions:\nimport time\nfrom openai import OpenAI\n\nclient = OpenAI()\nmax_retries = 3\nfor attempt in range(max_retries):\n    try:\n        stream = client.chat.completions.create(\n            model='gpt-4',\n            messages=[{'role': 'user', 'content': 'Hello'}],\n            stream=True\n        )\n        for chunk in stream:\n            # A chunk may arrive with an empty choices list, so guard before indexing\n            if chunk.choices:\n                print(chunk.choices[0].delta.content or '', end='')\n        break\n    except Exception as e:\n        # Retry only stream interruptions, and re-raise once attempts run out\n        if 'Incomplete stream' in str(e) and attempt < max_retries - 1:\n            wait = 2 ** attempt\n            print(f'Retrying in {wait}s...')\n            time.sleep(wait)\n        else:\n            raise", "success_rate": 0.85, "how": "Implement retry logic with exponential backoff for stream interruptions:\nimport time\nfrom openai import OpenAI\n\nclient = OpenAI()\nmax_retries = 3\nfor attempt in range(max_retries):\n    try:\n        stream = client.chat.completions.create(\n            model='gpt-4',\n            messages=[{'role': 'user', 'content': 'Hello'}],\n            stream=True\n        )\n        for chunk in stream:\n            # A chunk may arrive with an empty choices list, so guard before indexing\n            if chunk.choices:\n                print(chunk.choices[0].delta.content or '', end='')\n        break\n    except Exception as e:\n        # Retry only stream interruptions, and re-raise once attempts run out\n        if 'Incomplete stream' in str(e) and attempt < max_retries - 1:\n            wait = 2 ** attempt\n            print(f'Retrying in {wait}s...')\n            time.sleep(wait)\n        else:\n            raise", "condition": "", "sources": [] }, { "action": "Use a more robust HTTP client with connection pooling and keep-alive:\nfrom httpx import Client, Limits\n\nlimits = Limits(max_keepalive_connections=5, keepalive_expiry=30.0)\nwith Client(limits=limits) as client:\n    try:\n        # client.stream() yields lines as they arrive; client.post() buffers the whole body first\n        with client.stream(\n            'POST',\n            'https://api.openai.com/v1/chat/completions',\n            json={'model': 'gpt-4', 'messages': [{'role': 'user', 'content': 'Hello'}], 'stream': True},\n            headers={'Authorization': 'Bearer YOUR_API_KEY'},\n            timeout=30.0\n        ) as response:\n            response.raise_for_status()\n            for line in response.iter_lines():\n                if line:\n                    print(line)\n    except Exception as e:\n        print(f'Stream error: {e}')", "success_rate": 0.8, "how": "Use a more robust HTTP client with connection pooling and keep-alive:\nfrom httpx import Client, Limits\n\nlimits = Limits(max_keepalive_connections=5, keepalive_expiry=30.0)\nwith Client(limits=limits) as client:\n    try:\n        # client.stream() yields lines as they arrive; client.post() buffers the whole body first\n        with client.stream(\n            'POST',\n            'https://api.openai.com/v1/chat/completions',\n            json={'model': 'gpt-4', 'messages': [{'role': 'user', 'content': 'Hello'}], 'stream': True},\n            headers={'Authorization': 'Bearer YOUR_API_KEY'},\n            timeout=30.0\n        ) as response:\n            response.raise_for_status()\n            for line in response.iter_lines():\n                if line:\n                    print(line)\n    except Exception as e:\n        print(f'Stream error: {e}')", "condition": "", "sources": [] } ], "workarounds_zh": [ "为流中断实现带指数退避的重试逻辑：\nimport time\nfrom openai import OpenAI\n\nclient = OpenAI()\nmax_retries = 3\nfor attempt in range(max_retries):\n    try:\n        stream = client.chat.completions.create(\n            model='gpt-4',\n            messages=[{'role': 'user', 'content': '你好'}],\n            stream=True\n        )\n        for chunk in stream:\n            # 某些数据块的 choices 可能为空，索引前先检查\n            if chunk.choices:\n                print(chunk.choices[0].delta.content or '', end='')\n        break\n    except Exception as e:\n        # 仅对流中断进行重试，重试次数用尽后重新抛出异常\n        if 'Incomplete stream' in str(e) and attempt < max_retries - 1:\n            wait = 2 ** attempt\n            print(f'{wait}秒后重试...')\n            time.sleep(wait)\n        else:\n            raise", "使用更健壮的HTTP客户端，带有连接池和保持活动连接：\nfrom httpx import Client, Limits\n\nlimits = Limits(max_keepalive_connections=5, keepalive_expiry=30.0)\nwith Client(limits=limits) as client:\n    try:\n        # client.stream() 会在数据到达时逐行返回；client.post() 会先缓冲整个响应体\n        with client.stream(\n            'POST',\n            'https://api.openai.com/v1/chat/completions',\n            json={'model': 'gpt-4', 'messages': [{'role': 'user', 'content': '你好'}], 'stream': True},\n            headers={'Authorization': 'Bearer YOUR_API_KEY'},\n            timeout=30.0\n        ) as response:\n            response.raise_for_status()\n            for line in response.iter_lines():\n                if line:\n                    print(line)\n    except Exception as e:\n        print(f'流错误: {e}')" ], "transition_graph": { "leads_to": [], "preceded_by": [], "frequently_confused_with": [] }, "official_doc_url": "https://platform.openai.com/docs/guides/streaming", "official_doc_section": null, 
"error_code": null, "verification_tier": "ai_generated", "confidence": 0.85, "fix_success_rate": 0.8, "resolvable": "partial", "first_seen": "2023-04-10", "last_confirmed": "2024-06-01", "last_updated": "2024-06-01", "evidence_count": 1, "tags": [], "locale": "en", "aliases": [] }