# Error: context length exceeded while processing streaming chunks — partial response returned

- **ID:** `llm/context-window-exceeded-with-chunked-streaming`
- **Domain:** llm
- **Category:** runtime_error
- **Verification:** ai_generated
- **Fix Rate:** 80%

## Root Cause

During streaming, cumulative input and output tokens exceed the model's context window, causing the API to truncate the response mid-stream without a clear error.

## Version Compatibility

| Version | Status | Introduced | Deprecated |
|---------|--------|------------|------------|
| openai==1.12.0 | active | — | — |
| anthropic==0.25.0 | active | — | — |
| langchain==0.1.12 | active | — | — |
| gpt-4-turbo-2024-04-09 | active | — | — |
| claude-3-opus-20240229 | active | — | — |

## Workarounds

1. **Before streaming, calculate total tokens using tiktoken (e.g., `import tiktoken; enc = tiktoken.encoding_for_model('gpt-4'); tokens = enc.encode(prompt); if len(tokens) > 120000: truncate prompt`). Truncate the input to leave room for output.** (85% success)
   ```
   Before streaming, calculate total tokens using tiktoken (e.g., `import tiktoken; enc = tiktoken.encoding_for_model('gpt-4'); tokens = enc.encode(prompt); if len(tokens) > 120000: truncate prompt`). Truncate the input to leave room for output.
   ```
2. **Reduce the output length by lowering max_tokens, and implement a loop to resume generation from the last complete sentence if truncated.** (75% success)
   ```
   Reduce the output length by lowering max_tokens, and implement a loop to resume generation from the last complete sentence if truncated.
   ```

## Dead Ends

- **** — Increasing max_tokens in the request doesn't help because the total (input + output) exceeds the model's limit, and max_tokens only caps output. (85% fail)
- **** — Retrying the same request with no changes will reproduce the error since the context is still too large. (95% fail)
- **** — Switching to a different streaming library (e.g., from openai to httpx) doesn't solve the underlying token limit issue. (90% fail)
