{
  "id": "llm/llamaindex-embedding-dim-mismatch-update",
  "signature": "ValueError: Embedding dimension mismatch: index has dimension 1536 but new embeddings have dimension 768. Rebuild index or set allow_dangerous_deserialization=True.",
  "signature_zh": "ValueError：嵌入维度不匹配：索引维度为 1536，但新嵌入维度为 768。请重建索引或设置 allow_dangerous_deserialization=True。",
  "regex": "ValueError: Embedding dimension mismatch: index has dimension \\d+ but new embeddings have dimension \\d+\\.",
  "domain": "llm",
  "category": "data_error",
  "subcategory": null,
  "root_cause": "A LlamaIndex vector store index was built with one embedding model (e.g., text-embedding-ada-002, which outputs 1536-dimensional vectors) but is being updated with embeddings of a different size, either from a different model or from a model called with a reduced output size (e.g., text-embedding-3-small requested with dimensions=768; its default is also 1536). A vector index stores a single fixed dimension, so the new vectors are rejected.",
  "root_cause_type": "generic",
  "root_cause_zh": "LlamaIndex 向量存储索引是使用一个嵌入模型（例如 text-embedding-ada-002，输出 1536 维向量）构建的，但正在使用不同维度的嵌入进行更新，这些嵌入来自不同的模型，或来自输出维度被调小的模型（例如以 dimensions=768 调用的 text-embedding-3-small，其默认维度同样是 1536）。向量索引只存储一种固定维度，因此新向量会被拒绝。",
  "versions": [
    {
      "version": "llama-index 0.10.0",
      "introduced": null,
      "deprecated": null,
      "removed": null,
      "behavior_change": null,
      "status": "active"
    },
    {
      "version": "text-embedding-ada-002",
      "introduced": null,
      "deprecated": null,
      "removed": null,
      "behavior_change": null,
      "status": "active"
    },
    {
      "version": "text-embedding-3-small",
      "introduced": null,
      "deprecated": null,
      "removed": null,
      "behavior_change": null,
      "status": "active"
    },
    {
      "version": "OpenAI API 2024-02-15",
      "introduced": null,
      "deprecated": null,
      "removed": null,
      "behavior_change": null,
      "status": "active"
    }
  ],
  "os_specific": {},
  "dead_ends": [
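    {
      "action": "Set allow_dangerous_deserialization=True, as the error message suggests.",
      "why_fails": "This flag only permits loading pickle-serialized index files, which is unsafe for untrusted files; it does not change the index's stored dimension, so the index still rejects the new embeddings.",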
      "fail_rate": 1.0,
      "condition": "",
      "sources": []
    },
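    {
      "action": "Pad or truncate the new embeddings so their length matches the index dimension.",
      "why_fails": "Padded or truncated vectors from one model do not lie in the same embedding space as the vectors already in the index, so similarity scores between them are meaningless and retrieval breaks.",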
      "fail_rate": 0.95,
      "condition": "",
      "sources": []
    },
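    {
      "action": "Switch back to the original embedding model (e.g., text-embedding-ada-002) so the existing index can be kept.",
      "why_fails": "Reverting removes the error only if every ingestion and query path uses the old model again, and any vectors already written with the new model must be removed. text-embedding-ada-002 has been superseded by the text-embedding-3 models, so this defers the migration rather than resolving it.",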
      "fail_rate": 0.6,
      "condition": "",
      "sources": []
    }
  ],
  "workarounds": [
    {
      "action": "Rebuild the index from scratch with the new embedding model.",
      "success_rate": 0.95,
      "how": "Re-embed all source documents with the new model and persist the result to a new directory. In code (assumes the llama-index-embeddings-openai package is installed and OPENAI_API_KEY is set): `from llama_index.core import VectorStoreIndex, SimpleDirectoryReader; from llama_index.embeddings.openai import OpenAIEmbedding; documents = SimpleDirectoryReader('data').load_data(); index = VectorStoreIndex.from_documents(documents, embed_model=OpenAIEmbedding(model='text-embedding-3-small')); index.storage_context.persist(persist_dir='new_index')`. Use the same embed_model when loading and querying the rebuilt index.",
      "condition": "",
      "sources": []
    },
    {
      "action": "Create a new collection in the vector database (e.g., Chroma) and re-insert all documents.",
      "success_rate": 0.9,
      "how": "Create a new collection in the vector database (e.g., Chroma) for the new embedding dimension instead of reusing the old one, point the LlamaIndex vector store at it, and re-embed and re-insert all documents with the new model so that every stored vector has the same dimension.",
      "condition": "",
      "sources": []
    }
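  ],
  "workarounds_zh": [
    "使用新的嵌入模型从头重建索引，并将结果保存到新目录。代码示例（假设已安装 llama-index-embeddings-openai 包并设置了 OPENAI_API_KEY）：`from llama_index.core import VectorStoreIndex, SimpleDirectoryReader; from llama_index.embeddings.openai import OpenAIEmbedding; documents = SimpleDirectoryReader('data').load_data(); index = VectorStoreIndex.from_documents(documents, embed_model=OpenAIEmbedding(model='text-embedding-3-small')); index.storage_context.persist(persist_dir='new_index')`。加载和查询重建后的索引时，请使用相同的 embed_model。",
    "在向量数据库（例如 Chroma）中为新的嵌入维度创建新集合，而不是复用旧集合，并使用新模型重新嵌入并插入所有文档。"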
  ],
  "transition_graph": {
    "leads_to": [],
    "preceded_by": [],
    "frequently_confused_with": []
  },
  "official_doc_url": "https://docs.llamaindex.ai/en/stable/module_guides/indexing/vector_store_index.html",
  "official_doc_section": null,
  "error_code": null,
  "verification_tier": "ai_generated",
  "confidence": 0.87,
  "fix_success_rate": 0.9,
  "resolvable": "true",
  "first_seen": "2024-02-28",
  "last_confirmed": "2024-06-01",
  "last_updated": "2024-06-01",
  "evidence_count": 1,
  "tags": [],
  "locale": "en",
  "aliases": []
}