# Warning: Input text truncated to 8192 tokens for embedding model 'text-embedding-3-small' — embedding quality may degrade

- **ID:** `llm/embedding-truncation-mismatch`
- **Domain:** llm
- **Category:** data_error
- **Verification:** ai_generated
- **Fix Rate:** 80%

## Root Cause

Embedding models cap the number of input tokens (e.g., 8192 for `text-embedding-3-small`). Longer inputs are truncated without raising an error, so the resulting vector carries no information about anything that appears after the cutoff.
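
To make the size of the loss concrete, the arithmetic can be sketched as follows. This is a minimal illustration, not part of any library: `truncation_report` is a hypothetical helper, and the 8192 default is simply the limit quoted in the warning above. The token count is assumed to come from the model's own tokenizer.

```python
def truncation_report(n_tokens: int, limit: int = 8192) -> dict:
    """Summarize how much of an input survives a hard token limit."""
    kept = min(n_tokens, limit)
    dropped = n_tokens - kept
    return {
        "kept": kept,
        "dropped": dropped,
        # Share of the original tokens that never reaches the model.
        "fraction_lost": dropped / n_tokens if n_tokens else 0.0,
    }


# A 10,000-token document against an 8192-token limit.
report = truncation_report(10_000)
print(report)  # 1808 tokens dropped, i.e. about 18% of the text
```

Logging a report like this next to each embedding makes it possible to find, after the fact, which stored vectors represent only part of their source text.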

## Version Compatibility

| Package / Model | Status | Introduced | Deprecated |
|---------|--------|------------|------------|
| openai>=1.0.0 | active | — | — |
| text-embedding-3-small | active | — | — |
| text-embedding-3-large | active | — | — |
| text-embedding-ada-002 | active | — | — |

## Workarounds

1. **Pre-process input text by truncating to the model's token limit using the same tokenizer (e.g., tiktoken for OpenAI models) before sending to the API, and log the truncation explicitly.** (90% success)
2. **Use a sliding window or chunking strategy: split long documents into overlapping chunks of max_tokens, embed each chunk separately, and store all embeddings with metadata for retrieval.** (85% success)
3. **For RAG pipelines, prioritize embedding the most semantically important parts of the text (e.g., beginning and key sections) rather than relying on automatic truncation of the end.** (75% success)
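
The three workarounds above can be sketched on plain lists of token ids. This is a rough sketch under stated assumptions: the function names, the `overlap` default, and the `priority` argument are hypothetical, and the token ids are assumed to come from the same tokenizer the model uses (e.g., tiktoken for OpenAI models). The code deliberately makes no API calls.

```python
import logging
from typing import List, Sequence

logger = logging.getLogger("embedding_prep")  # hypothetical logger name
MAX_TOKENS = 8192  # limit quoted in the warning; confirm for your model


def truncate_tokens(tokens: Sequence[int], limit: int = MAX_TOKENS) -> List[int]:
    """Workaround 1: cut to the limit ourselves and log what was dropped."""
    if len(tokens) > limit:
        logger.warning("Truncating input from %d to %d tokens (%d dropped)",
                       len(tokens), limit, len(tokens) - limit)
    return list(tokens[:limit])


def chunk_tokens(tokens: Sequence[int], size: int = MAX_TOKENS,
                 overlap: int = 256) -> List[dict]:
    """Workaround 2: overlapping windows, each with start/end metadata."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    chunks, start = [], 0
    while start < len(tokens):
        end = min(start + size, len(tokens))
        chunks.append({"start": start, "end": end,
                       "tokens": list(tokens[start:end])})
        if end == len(tokens):  # last window reached the end of the text
            break
        start += step
    return chunks


def select_sections(sections: Sequence[Sequence[int]], priority: Sequence[int],
                    budget: int = MAX_TOKENS) -> List[int]:
    """Workaround 3: keep whole sections by priority, emit in document order."""
    chosen, used = set(), 0
    for idx in priority:  # most important section index first
        if used + len(sections[idx]) <= budget:
            chosen.add(idx)
            used += len(sections[idx])
    return [tok for i, sec in enumerate(sections) if i in chosen for tok in sec]


# Small demonstration with stand-in token ids.
doc = list(range(10))
windows = chunk_tokens(doc, size=4, overlap=1)
print([(w["start"], w["end"]) for w in windows])
```

Each chunk would then be decoded and embedded separately, with `start` and `end` stored as metadata so that a retrieval hit can be mapped back to its position in the source document.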

## Dead Ends

- **Passing a `max_tokens` parameter to the embedding call**: The embedding API does not accept a `max_tokens` parameter; truncation is automatic and controlled by model limits (95% fail)
- **Averaging chunk embeddings into a single vector**: Averaging embeddings from different chunks loses positional and semantic relationships and is not equivalent to a single embedding of the full text (70% fail)
- **Relying on automatic truncation**: Truncated embeddings miss critical information from the end of the text, leading to poor retrieval quality in RAG systems (85% fail)
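
One way to see the positional loss from averaging is that a mean ignores order. In the toy sketch below, the 2-dimensional vectors are made-up stand-ins for chunk embeddings, and `mean_pool` is a hypothetical helper. Two documents built from the same chunks in a different order produce the same averaged vector.

```python
from typing import List, Sequence


def mean_pool(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


# Made-up 2-d "embeddings" for three chunks (illustration only).
intro, method, result = [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]

doc_a = [intro, method, result]
doc_b = [result, intro, method]  # same chunks, different order

pooled_a = mean_pool(doc_a)
pooled_b = mean_pool(doc_b)
print(pooled_a == pooled_b)  # the averaged vectors cannot tell the two apart
```

Keeping one vector per chunk, as in the chunking workaround, avoids this collapse because each vector stays tied to its own span of text.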
