llm type_error ai_generated true

ValueError: Query vector dimension (384) does not match index dimension (768)

ID: llm/embedding-dimension-mismatch-in-vector-search

Fix rate: 90%
Confidence: 87%
Evidence count: 1
First seen: 2023-11-20

Version compatibility

Version                        Status  Introduced  Deprecated  Notes
openai==1.10.0                 active
pinecone-client==3.0.0         active
chromadb==0.4.22               active
text-embedding-ada-002         active
all-MiniLM-L6-v2               active
sentence-transformers==2.2.2   active

Root cause analysis

The embedding model used to encode the query has a different output dimension than the model used to build the vector index. This usually follows a model switch, for example from text-embedding-ada-002 (1536 dimensions) to a smaller model such as all-MiniLM-L6-v2 (384 dimensions), without rebuilding the index.
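The mismatch can be caught before the query reaches the vector store. The following is a minimal sketch in plain Python, not any library's API: `check_query_dimension` is a hypothetical helper, and the 384-element list stands in for a real embedding.

```python
from typing import Sequence


def check_query_dimension(query_vector: Sequence[float], index_dimension: int) -> None:
    """Raise ValueError if the query vector cannot be searched against the index.

    Hypothetical helper for illustration; index_dimension is whatever
    dimension the index was built with.
    """
    query_dimension = len(query_vector)
    if query_dimension != index_dimension:
        raise ValueError(
            f"Query vector dimension ({query_dimension}) does not match "
            f"index dimension ({index_dimension})"
        )


# Stand-in for a 384-dimensional query embedding.
query_vector = [0.0] * 384

try:
    check_query_dimension(query_vector, index_dimension=768)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

Running the check right after encoding the query makes the failure explicit and points at the model configuration rather than at the database call.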

generic

Official documentation

https://platform.openai.com/docs/guides/embeddings

Solutions

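  1. Re-embed all documents with the same embedding model that is used for queries. For example, if you use text-embedding-ada-002 (1536 dimensions), make sure both the index and the queries use that model.
  2. Record the embedding model and dimension as metadata for each index, and at query time route each query to the index built with the matching model. A single index or collection generally holds vectors of one dimension only (a ChromaDB collection, for example, takes its dimension from the first vectors added), so vectors from different models need separate collections.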
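Both fixes above come down to encoding queries with the model the index was built with. A minimal sketch, assuming the model name and dimension were recorded when each index was built; `EMBEDDERS`, `INDEX_METADATA`, the index names, and the two toy embedders are all hypothetical stand-ins, not real model or database calls.

```python
from typing import Callable, Dict, List

Embedder = Callable[[str], List[float]]


def embed_small(text: str) -> List[float]:
    # Hypothetical stand-in for a 384-dimensional embedding model.
    return [float(len(text))] * 384


def embed_large(text: str) -> List[float]:
    # Hypothetical stand-in for a 1536-dimensional embedding model.
    return [float(len(text))] * 1536


# Maps a model name to the function that produces its embeddings.
EMBEDDERS: Dict[str, Embedder] = {
    "small-model": embed_small,
    "large-model": embed_large,
}

# Metadata recorded when each index was built (assumed to exist).
INDEX_METADATA: Dict[str, dict] = {
    "docs-v1": {"model": "large-model", "dimension": 1536},
    "docs-v2": {"model": "small-model", "dimension": 384},
}


def embed_query_for_index(index_name: str, text: str) -> List[float]:
    """Encode a query with the model recorded for the target index."""
    meta = INDEX_METADATA[index_name]
    vector = EMBEDDERS[meta["model"]](text)
    # Guard against a model whose output no longer matches the recorded dimension.
    if len(vector) != meta["dimension"]:
        raise ValueError(
            f"Query vector dimension ({len(vector)}) does not match "
            f"index dimension ({meta['dimension']})"
        )
    return vector


print(len(embed_query_for_index("docs-v1", "hello")))
print(len(embed_query_for_index("docs-v2", "hello")))
```

Keeping the model choice in index metadata, rather than in application code, means a model switch cannot silently change how queries are encoded for an existing index.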

Ineffective attempts

Common but ineffective approaches:

  1. 90% failure rate

    Resizing the query vector by padding with zeros or truncating corrupts the embedding and leads to poor search results.

  2. 70% failure rate

    Re-indexing all documents with the new model is correct but often skipped due to time constraints; partial re-indexing causes inconsistency.

  3. 95% failure rate

    Assuming the vector database automatically handles dimension conversion leads to silent failures or errors.
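Partial re-indexing, as in item 2, can be detected by auditing which model and dimension each stored vector came from. The sketch below assumes each record carries a `model` tag alongside its vector; the record layout and `audit_index` are hypothetical, not a vector database API.

```python
from collections import Counter
from typing import Dict, List


def audit_index(records: List[Dict]) -> Counter:
    """Count stored vectors by (model name, vector length)."""
    return Counter((record["model"], len(record["vector"])) for record in records)


# Toy records: one vector left over from the old model, two from the new one.
records = [
    {"id": "a", "model": "old-model", "vector": [0.0] * 768},
    {"id": "b", "model": "new-model", "vector": [0.0] * 384},
    {"id": "c", "model": "new-model", "vector": [0.0] * 384},
]

summary = audit_index(records)
# A fully re-indexed store has exactly one (model, dimension) pair.
is_consistent = len(summary) == 1

print(dict(summary))
print(is_consistent)
```

An audit like this is cheap to run after a migration and shows how many documents still need re-embedding before queries are switched to the new model.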