pepsi

Reputation: 321

Getting BSONObj size error even with allowDiskUse true option

I have a collection with 300 million documents, each doc has a user_id field like following:

{
    "user_id": "1234567",
    // and other fields
}

I want to get a list of the unique user_ids in the collection, but the following mongo shell command results in an error.

db.collection.aggregate([
  { $group: { _id: null, user_ids: { $addToSet: "$user_id" } } }
], { allowDiskUse: true });
2021-11-23T14:50:28.163+0900 E  QUERY    [js] uncaught exception: Error: command failed: {
        "ok" : 0,
        "errmsg" : "Error on remote shard <host>:<port> :: caused by :: BSONObj size: 46032166 (0x2BE6526) is invalid. Size must be between 0 and 16793600(16MB) First element: _id: null",
        "code" : 10334,
        "codeName" : "BSONObjectTooLarge",
        "operationTime" : Timestamp(1637646628, 64),
        ...
} : aggregate failed :

Why does the error occur even with the allowDiskUse: true option? The DB version is 4.2.16.

Upvotes: 1

Views: 1284

Answers (2)

R2D2

Reputation: 10707

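You are trying to collect all unique user_ids into a single document, but that document grows beyond the 16MB BSON limit (the error reports 46032166 bytes), which causes the failure. allowDiskUse only lets pipeline stages spill to temporary files when they exceed their memory limit; it does not raise the maximum size of a single document.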
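
One way around this, as a rough sketch rather than a tested fix for your cluster, is to group by user_id itself so that every unique ID becomes its own small result document and the result is read through a cursor. The collection and field names are taken from the question; the typeof guard and the print loop are only there for illustration.

```javascript
// One output document per distinct user_id, so no single document
// grows with the number of IDs.
var pipeline = [
  { $group: { _id: "$user_id" } }
];

// `db` exists only inside the mongo shell; the guard lets the snippet
// be loaded elsewhere without throwing.
if (typeof db !== "undefined") {
  db.collection.aggregate(pipeline, { allowDiskUse: true }).forEach(function (doc) {
    print(doc._id); // each doc looks like { _id: "1234567" }
  });
}
```

If you need to keep the IDs, you could probably append an $out stage that writes them to a separate collection (the target collection name would be your choice) instead of printing them.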

Upvotes: 1

YuTing

Reputation: 6629

distinct may be more useful here:

db.collection.distinct( "user_id" )
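
One caveat: distinct returns all values as a single array, so as far as I know its result is subject to the same maximum BSON size. The sketch below is only a back-of-the-envelope estimate of how large such an array gets. It assumes 7-character ASCII IDs like the sample in the question, and the count of 2,000,000 unique IDs is a made-up number for illustration, not something taken from the error output.

```javascript
// Rough size in bytes of a BSON array holding `count` strings of
// `strLen` ASCII characters each.
// Per element: 1 type byte + index key as a C string (digits + NUL)
//              + 4-byte length prefix + string bytes + NUL.
function estimateStringArrayBytes(count, strLen) {
  var total = 4 + 1; // int32 length header + trailing NUL of the array
  for (var i = 0; i < count; i++) {
    var keyBytes = String(i).length + 1;
    total += 1 + keyBytes + 4 + strLen + 1;
  }
  return total;
}

// Limit as reported in the error message from the question.
var LIMIT_FROM_ERROR = 16793600;

// Hypothetical count of unique IDs, assumed only for this example.
var estimate = estimateStringArrayBytes(2000000, 7);
var tooLarge = estimate > LIMIT_FROM_ERROR;
```

Under these assumptions tooLarge comes out true, so if the number of unique IDs is in that range, grouping by user_id and reading from a cursor is likely the safer route.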

Upvotes: 1
