mongodb resource_error ai_generated true

MongoServerError: PlanExecutor error: aggregation stage 'lookup' caused memory pressure: estimated size 200MB exceeds 100MB limit

ID: mongodb/aggregation-lookup-unwind-memory-pressure

Fix Rate: 82%
Confidence: 86%
Evidence: 1
First Seen: 2024-09-30

Version Compatibility

Version      Status  Introduced  Deprecated  Notes
mongodb 6.0  active  -           -           -
mongodb 7.0  active  -           -           -
mongodb 8.0  active  -           -           -

Root Cause

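A $lookup stage followed by $unwind on large collections produced intermediate results that exceeded the 100MB memory limit. This often happens because the foreign field is not indexed or because the join condition matches far more documents than intended (a near-cartesian product).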
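
The pipeline shape that tends to hit this limit can be sketched as follows. The collection names `customers` and `orders` and the field `customerId` are borrowed from the workaround snippets in this entry; they are assumptions for illustration, not details from an actual failing deployment.

```javascript
// Hypothetical collections: `customers` (local side) and `orders` (foreign side).
// Every order matching a customer is gathered before $unwind flattens the array,
// so a customer with very many orders produces a very large intermediate result.
const unboundedJoinPipeline = [
  {
    $lookup: {
      from: "orders",
      localField: "_id",
      foreignField: "customerId",
      as: "orders",
    },
  },
  { $unwind: "$orders" },
];

// In mongosh this would be run as:
//   db.customers.aggregate(unboundedJoinPipeline)
```

Nothing in this pipeline bounds how many `orders` documents a single customer can pull in, which is the condition the workarounds below try to remove.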


Official Documentation

https://www.mongodb.com/docs/manual/reference/operator/aggregation/lookup/#memory-considerations

Workarounds

  1. 90% success: Create an index on the foreign field used in $lookup, e.g. `db.orders.createIndex({ customerId: 1 })`, so that each lookup is served from the index instead of a full scan of the foreign collection.
  2. 85% success: Restructure the pipeline: use $lookup with a sub-pipeline to filter and cap documents inside the join, e.g., `{ $lookup: { from: 'orders', let: { custId: '$_id' }, pipeline: [ { $match: { $expr: { $eq: ['$customerId', '$$custId'] } } }, { $limit: 100 } ], as: 'orders' } }`, which limits the number of matched documents per input document.
  3. 80% success: Split the aggregation: perform $lookup in a separate aggregation, write the results to an intermediate collection, then run $unwind on that smaller dataset.
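
The first two workarounds can be combined in a minimal sketch. The collection and field names come from the snippets above; the helper `buildBoundedLookupPipeline` and its parameter `maxOrdersPerCustomer` are hypothetical names introduced here for illustration only.

```javascript
// Index specification for the foreign field (workaround 1).
// In mongosh: db.orders.createIndex(foreignFieldIndex)
const foreignFieldIndex = { customerId: 1 };

// Builds a pipeline whose $lookup caps the matches per input document
// (workaround 2). `maxOrdersPerCustomer` is a hypothetical parameter.
function buildBoundedLookupPipeline(maxOrdersPerCustomer = 100) {
  if (!Number.isInteger(maxOrdersPerCustomer) || maxOrdersPerCustomer < 1) {
    throw new RangeError("maxOrdersPerCustomer must be a positive integer");
  }
  return [
    {
      $lookup: {
        from: "orders",
        let: { custId: "$_id" },
        pipeline: [
          // Join condition: orders.customerId equals the outer document's _id.
          { $match: { $expr: { $eq: ["$customerId", "$$custId"] } } },
          // Cap the number of joined documents per customer.
          { $limit: maxOrdersPerCustomer },
        ],
        as: "orders",
      },
    },
    { $unwind: "$orders" },
  ];
}

// In mongosh: db.customers.aggregate(buildBoundedLookupPipeline(100))
```

Note that `$limit` truncates the joined set, so this variant only fits cases where a partial list of matches per document is acceptable.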

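
The split-aggregation workaround might look like the sketch below. The intermediate collection name `customer_orders_tmp`, the helper `buildJoinAndStorePipeline`, and the `customerFilter` argument are all hypothetical; the entry does not say how the intermediate collection is written, so `$out` is an assumption made here.

```javascript
// Hypothetical name for the intermediate collection.
const INTERMEDIATE_COLLECTION = "customer_orders_tmp";

// Step 1: join and persist. `customerFilter` is a hypothetical $match filter
// that keeps each run small, for example one subset of customers at a time.
function buildJoinAndStorePipeline(customerFilter = {}) {
  return [
    { $match: customerFilter },
    {
      $lookup: {
        from: "orders",
        localField: "_id",
        foreignField: "customerId",
        as: "orders",
      },
    },
    // Write the joined documents to the intermediate collection.
    { $out: INTERMEDIATE_COLLECTION },
  ];
}

// Step 2: unwind the stored results in a separate aggregation.
const unwindStoredPipeline = [{ $unwind: "$orders" }];

// In mongosh (the filter value is only an example):
//   db.customers.aggregate(buildJoinAndStorePipeline({ region: "EU" }))
//   db.customer_orders_tmp.aggregate(unwindStoredPipeline)
```

Because `$out` replaces the target collection on every run, `$merge` may be a better fit if several filtered batches need to accumulate in the same intermediate collection.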

Dead Ends

Common approaches that don't work:

  1. 85% fail: Enabling allowDiskUse

    allowDiskUse does not apply to intermediate results of $lookup before $unwind; it only helps for sorting and grouping stages after the join.

  2. 95% fail: Raising the 100MB limit

    The 100MB limit is hard-coded for $lookup intermediate results and cannot be changed; attempting to set it has no effect.

  3. 70% fail: Removing or replacing the $unwind stage

    While this may reduce memory, it changes the output structure; the root cause is the large join size, not the unwind itself.