# ValueError: Embedding dimension mismatch: index has dimension 1536 but new embeddings have dimension 768. Rebuild index or set allow_dangerous_deserialization=True.

- **ID:** `llm/llamaindex-embedding-dim-mismatch-update`
- **Domain:** llm
- **Category:** data_error
- **Verification:** ai_generated
- **Fix Rate:** 90%

## Root Cause

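A LlamaIndex vector store index was built with one embedding model (e.g., `text-embedding-ada-002`, 1536 dimensions) but is being updated with embeddings of a different size. Typical sources of 768-dimensional vectors are a model such as `BAAI/bge-base-en-v1.5`, or `text-embedding-3-small` called with `dimensions=768` (its default output is 1536 dimensions). Vectors of different lengths cannot be compared, so the store rejects the update.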
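The check behind this error can be approximated as follows. This is a minimal sketch, not LlamaIndex's actual implementation; `assert_same_dimension` and its arguments are hypothetical names introduced for illustration. Running a check like this before an update surfaces the mismatch before anything is written.

```python
from typing import Sequence


def assert_same_dimension(
    index_dim: int, new_embeddings: Sequence[Sequence[float]]
) -> None:
    """Raise ValueError if any new vector's length differs from the index dimension."""
    for position, vector in enumerate(new_embeddings):
        if len(vector) != index_dim:
            raise ValueError(
                f"Embedding dimension mismatch at item {position}: "
                f"index has dimension {index_dim} but new embedding "
                f"has dimension {len(vector)}."
            )


# Hypothetical data: an index built with 1536-dim vectors, updated with a 768-dim one.
try:
    assert_same_dimension(1536, [[0.0] * 768])
except ValueError as err:
    mismatch_message = str(err)
```

In practice `index_dim` would come from wherever the index dimension is recorded (for example, the length of one stored vector), which is an assumption about the surrounding setup.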

## Version Compatibility

| Component / Version | Status | Introduced | Deprecated |
|---------------------|--------|------------|------------|
| llama-index 0.10.0 | active | — | — |
| text-embedding-ada-002 | active | — | — |
| text-embedding-3-small | active | — | — |
| OpenAI API 2024-02-15 | active | — | — |

## Workarounds

1. **Rebuild the index from scratch with the new embedding model.** (95% success)
   ```python
   from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
   # Requires the llama-index-embeddings-openai package.
   from llama_index.embeddings.openai import OpenAIEmbedding

   # embed_model expects an embedding object (or "default" / "local:<model>"),
   # not a bare OpenAI model name.
   embed_model = OpenAIEmbedding(model="text-embedding-3-small")

   documents = SimpleDirectoryReader("data").load_data()
   index = VectorStoreIndex.from_documents(documents, embed_model=embed_model)
   index.storage_context.persist(persist_dir="new_index")
   ```
2. **Create a new collection in the vector database (e.g., Chroma) for the new embedding dimension, then re-embed and re-insert all documents with the new model.** (90% success)
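The second workaround can be sketched as below. This is a hedged outline rather than a definitive migration script: `reinsert_documents`, `embed_fn`, and the batch size are hypothetical names and values chosen for illustration, and `client` is assumed to expose Chroma-style `create_collection` and `add` calls.

```python
from typing import Callable, List, Sequence


def reinsert_documents(
    client,
    collection_name: str,
    ids: Sequence[str],
    texts: Sequence[str],
    embed_fn: Callable[[List[str]], List[List[float]]],
    batch_size: int = 100,
):
    """Create a fresh collection and fill it with newly computed embeddings."""
    # A new name leaves the old collection (and its old dimension) untouched.
    collection = client.create_collection(name=collection_name)
    for start in range(0, len(texts), batch_size):
        batch_texts = list(texts[start:start + batch_size])
        batch_ids = list(ids[start:start + batch_size])
        collection.add(
            ids=batch_ids,
            documents=batch_texts,
            embeddings=embed_fn(batch_texts),  # vectors from the new model only
        )
    return collection
```

With Chroma, `client` could be a `chromadb.PersistentClient` and `embed_fn` a thin wrapper around the new embedding model. Keeping the old collection until the new one has been verified makes it possible to roll back.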

## Dead Ends

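- **Setting `allow_dangerous_deserialization=True`**: This flag allows loading a potentially malicious pickle file; it does not fix the dimension mismatch. The index will still reject new embeddings. (100% fail)
- **Padding or truncating the new embeddings to the index dimension**: Padding/truncating corrupts the embedding space, leading to meaningless similarity scores and broken retrieval. (95% fail)
- **Switching back to the old embedding model (`text-embedding-ada-002`) to avoid a rebuild**: This only helps if every new document is embedded with the old model again; vectors already produced by the new model still cannot be added. `text-embedding-ada-002` remains accessible but has been superseded by the `text-embedding-3` family, and its outputs are not guaranteed to stay identical over time, so this postpones the migration rather than avoiding it. (60% fail)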
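The effect of truncation on similarity scores can be illustrated with toy numbers. The vectors below are made up for illustration and are not real embeddings. The sketch only shows the information loss from dropping components; with vectors from two different models the situation is worse, because their axes do not correspond at all.

```python
import math
from typing import Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# Toy 4-dimensional vectors (hypothetical values).
query = [1.0, 0.0, 0.0, 1.0]
relevant = [1.0, 0.0, 0.0, 1.0]
unrelated = [1.0, 0.0, 0.0, -1.0]

# Full vectors separate the two candidates: roughly 1.0 versus 0.0.
full_scores = (cosine(query, relevant), cosine(query, unrelated))

# Truncating to the first 2 components drops the distinguishing axis,
# so both candidates now score the same.
truncated_scores = (
    cosine(query[:2], relevant[:2]),
    cosine(query[:2], unrelated[:2]),
)
```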
