{
  "id": "llm/embedding-truncation-mismatch",
  "signature": "Warning: Input text truncated to 8192 tokens for embedding model 'text-embedding-3-small' — embedding quality may degrade",
  "signature_zh": "警告：嵌入模型'text-embedding-3-small'的输入文本被截断为8192个令牌——嵌入质量可能下降",
  "regex": ".*Input text truncated to \\d+ tokens for embedding model.*",
  "domain": "llm",
  "category": "data_error",
  "subcategory": null,
  "root_cause": "Embedding models have a maximum input token limit (e.g., 8192 for text-embedding-3-small). Longer inputs are truncated automatically to that limit, so the semantic information at the end of the text is lost.",
  "root_cause_type": "generic",
  "root_cause_zh": "嵌入模型有最大输入令牌限制（例如 text-embedding-3-small 为 8192）；超长输入会被自动截断到该限制，导致文本末尾的语义信息丢失。",
  "versions": [
    {
      "version": "openai>=1.0.0",
      "introduced": null,
      "deprecated": null,
      "removed": null,
      "behavior_change": null,
      "status": "active"
    },
    {
      "version": "text-embedding-3-small",
      "introduced": null,
      "deprecated": null,
      "removed": null,
      "behavior_change": null,
      "status": "active"
    },
    {
      "version": "text-embedding-3-large",
      "introduced": null,
      "deprecated": null,
      "removed": null,
      "behavior_change": null,
      "status": "active"
    },
    {
      "version": "text-embedding-ada-002",
      "introduced": null,
      "deprecated": null,
      "removed": null,
      "behavior_change": null,
      "status": "active"
    }
  ],
  "os_specific": {},
  "dead_ends": [
    {
      "action": "Pass a max_tokens parameter to the embedding API call to control or disable truncation.",
      "why_fails": "The embedding API does not accept a max_tokens parameter; truncation is automatic and controlled by model limits",
      "fail_rate": 0.95,
      "condition": "",
      "sources": []
    },
    {
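      "action": "Split the text into chunks, embed each chunk, and average the chunk embeddings into a single vector.",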
      "why_fails": "Averaging embeddings from different chunks loses positional and semantic relationships; not equivalent to a single embedding of the full text",
      "fail_rate": 0.7,
      "condition": "",
      "sources": []
    },
    {
      "action": "Ignore the warning and use the truncated embedding as-is.",
      "why_fails": "Truncated embeddings miss critical information from the end of the text, leading to poor retrieval quality in RAG systems",
      "fail_rate": 0.85,
      "condition": "",
      "sources": []
    }
  ],
  "workarounds": [
    {
      "action": "Truncate the input to the model's token limit yourself, before calling the API, and log the truncation.",
      "success_rate": 0.9,
      "how": "Count tokens with the same tokenizer the model uses (e.g., tiktoken for OpenAI models), cut the input at the model's token limit before sending it to the API, and log each truncation explicitly so the information loss is visible.",
      "condition": "",
      "sources": []
    },
    {
      "action": "Split long documents into overlapping chunks and embed each chunk separately.",
      "success_rate": 0.85,
      "how": "Use a sliding window or chunking strategy: split long documents into overlapping chunks that each fit within the model's token limit, embed each chunk separately, and store all embeddings with metadata (e.g., source document and chunk position) for retrieval.",
      "condition": "",
      "sources": []
    },
    {
      "action": "For RAG pipelines, choose which parts of the text to embed instead of letting the end be cut off.",
      "success_rate": 0.75,
      "how": "For RAG pipelines, prioritize embedding the most semantically important parts of the text (e.g., the beginning and key sections) rather than relying on automatic truncation of the end.",
      "condition": "",
      "sources": []
    }
  ],
  "workarounds_zh": [
    "在发送到 API 之前，使用相同的分词器（例如 OpenAI 模型的 tiktoken）将输入文本预截断到模型的令牌限制，并显式记录截断。",
    "使用滑动窗口或分块策略：将长文档分割成不超过模型令牌限制的重叠块，分别嵌入每个块，并将所有嵌入连同元数据存储用于检索。",
    "对于 RAG 管道，优先嵌入文本中语义最重要的部分（例如开头和关键部分），而不是依赖对末尾的自动截断。"
  ],
  "transition_graph": {
    "leads_to": [],
    "preceded_by": [],
    "frequently_confused_with": []
  },
  "official_doc_url": "https://platform.openai.com/docs/guides/embeddings/embedding-models",
  "official_doc_section": null,
  "error_code": null,
  "verification_tier": "ai_generated",
  "confidence": 0.85,
  "fix_success_rate": 0.8,
  "resolvable": "partial",
  "first_seen": "2024-02-20",
  "last_confirmed": "2024-06-01",
  "last_updated": "2024-06-01",
  "evidence_count": 1,
  "tags": [],
  "locale": "en",
  "aliases": []
}