# chromadb.errors.DimensionError: Inserted embedding dimension (1536) does not match collection dimension (768)

- **ID:** `llm/embedding-length-mismatch-on-insert`
- **Domain:** llm
- **Category:** data_error
- **Verification level:** ai_generated
- **Fix rate:** 95%

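## Root cause

The embedding model used for insertion produces vectors whose size differs from the dimension the collection expects. This usually happens after switching embedding models or when model versions are mismatched.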
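
A minimal sketch of a pre-insert guard, assuming the caller already knows the dimension the collection was created with. `check_embedding_dims` is a hypothetical helper, not part of chromadb; it only makes the mismatch visible before the vectors reach the collection.

```python
from typing import Sequence


def check_embedding_dims(
    embeddings: Sequence[Sequence[float]], expected_dim: int
) -> None:
    """Raise ValueError if any vector's length differs from expected_dim.

    Hypothetical helper: expected_dim is whatever dimension the target
    collection was created with (768 in the error above).
    """
    for i, vec in enumerate(embeddings):
        if len(vec) != expected_dim:
            raise ValueError(
                f"embedding {i} has dimension {len(vec)}, "
                f"but the collection expects {expected_dim}"
            )


# Usage: a 1536-dim vector is rejected for a 768-dim collection.
try:
    check_embedding_dims([[0.0] * 1536], expected_dim=768)
except ValueError as exc:
    print(exc)
```

Running a check like this before each insert turns a database-level error into an early, descriptive failure in application code.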

## Version compatibility

| Version | Status | Introduced | Deprecated |
|------|------|------|------|
| chromadb>=0.4.0 | active | — | — |
| sentence-transformers>=2.2.0 | active | — | — |
| text-embedding-3-small | active | — | — |
| text-embedding-ada-002 | active | — | — |

## Solutions

1. Create a new collection with the correct dimension and re-embed all documents. Example: `collection = client.create_collection(name="my_collection", embedding_function=embedding_function, metadata={"hnsw:space": "cosine"})`, where `embedding_function` outputs 1536 dimensions.
2. If you use a different embedding model temporarily, keep a mapping from model to collection, or use a router that selects the correct collection based on the model.
3. Use a single embedding model that supports variable dimensions (e.g., text-embedding-3-small with the `dimensions` parameter) to enforce consistency.
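
Solutions 1 and 2 can be sketched together as a small router that keeps one collection per embedding model. This is a sketch under stated assumptions: `MODEL_DIMS`, `collection_name_for`, and `get_collection_for_model` are hypothetical names, `all-mpnet-base-v2` is only an illustrative 768-dimension model, and `client` is assumed to be a chromadb client exposing `get_or_create_collection`.

```python
import re
from typing import Any

# Default output dimensions per model (illustrative table, extend as needed).
MODEL_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-ada-002": 1536,
    "all-mpnet-base-v2": 768,  # example sentence-transformers model
}


def collection_name_for(base: str, model: str) -> str:
    """Build a per-model collection name, e.g. 'docs-all-mpnet-base-v2-768'."""
    dim = MODEL_DIMS[model]  # KeyError for unknown models is intentional
    # Keep only characters that are safe in a collection name.
    safe_model = re.sub(r"[^A-Za-z0-9_-]", "-", model)
    return f"{base}-{safe_model}-{dim}"


def get_collection_for_model(client: Any, base: str, model: str) -> Any:
    """Return the collection dedicated to `model`, creating it if missing.

    `client` is assumed to be a chromadb client; nothing here is
    chromadb-specific beyond the call to get_or_create_collection.
    """
    return client.get_or_create_collection(
        name=collection_name_for(base, model),
        metadata={"hnsw:space": "cosine"},
    )
```

After a model switch, documents would be re-embedded with the new model and added to the collection this router returns for it, while the old collection stays intact until it is no longer queried.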

## Failed attempts

- Upserting instead of adding does not help: the collection dimension is fixed at creation time, and upserting doesn't change the schema. (100% failure rate)
- Padding or truncating the vectors destroys semantic meaning and leads to poor retrieval results; the vector space becomes inconsistent. (95% failure rate)
- Mixing embedding models within one collection does not work: different models have different output dimensions, so you must use the same model for all inserts in a collection. (90% failure rate)
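
Rather than padding or truncating vectors by hand, solution 3 asks the model itself for the target size. The sketch below assumes `openai_client` is an `openai.OpenAI()` instance and that the collection uses 768 dimensions; `embed_with_fixed_dim` is a hypothetical helper name.

```python
from typing import Any, List


def embed_with_fixed_dim(
    openai_client: Any, texts: List[str], dim: int = 768
) -> List[List[float]]:
    """Embed texts with text-embedding-3-small at a fixed output size.

    `openai_client` is assumed to be an `openai.OpenAI()` instance.
    """
    response = openai_client.embeddings.create(
        model="text-embedding-3-small",
        input=texts,
        dimensions=dim,  # accepted by text-embedding-3 models, not ada-002
    )
    vectors = [item.embedding for item in response.data]
    # Fail fast if the returned size is not what the collection expects.
    for vec in vectors:
        if len(vec) != dim:
            raise ValueError(f"expected {dim} dimensions, got {len(vec)}")
    return vectors
```

Matching the dimension this way only prevents the error. Vectors from this model are still not comparable with 768-dimension vectors produced by a different model, so a collection built with another model would still need to be re-embedded.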
