Reputation: 1318
I have a collection of about 1M documents. Each document has an internalNumber property, and I need to get all internalNumber values in my Node.js code.
Previously I was using
db.docs.distinct("internalNumber")
or
collection.distinct('internalNumber', {}, {},(err, result) => { /* ... */ })
in Node.
But as the collection grew, I started to get the error: distinct is too big, 16m cap.
Now I want to use aggregation. It consumes a lot of memory and is slow, but that is acceptable since I only need to run it once at script startup. I tried the following in the Robo 3T GUI tool:
db.docs.aggregate([{$group: {_id: '$internalNumber'} }]);
It works, so I tried to use it in my Node.js code the following way:
collection.aggregate([{$group: {_id: '$internalNumber'} }],
(err, docs) => { /* ... */ });
But in Node I get an error: "MongoError: aggregation result exceeds maximum document size (16MB) at Function.MongoError.create".
Please help to overcome that limit.
Upvotes: 4
Views: 1589
Reputation: 297
For Casbah users:
val pipeline = ...
collection.aggregate(pipeline, AggregationOptions(batchSize = 500, outputMode = AggregationOptions.CURSOR))
Upvotes: 0
Reputation: 151122
The problem is that the native driver behaves differently from the shell method by default: the shell actually returns a "cursor" object, whereas the native driver needs this option set explicitly.
Without a "cursor", .aggregate() returns the whole result as an array of documents inside a single BSON document, which is subject to the 16MB limit, so we turn it into a cursor to avoid the limitation:
let cursor = collection.aggregate(
[{ "$group": { "_id": "$internalNumber" } }],
{ "cursor": { "batchSize": 500 } }
);
cursor.toArray((err,docs) => {
// work with results
});
Then you can use regular methods like .toArray() to turn the results into a JavaScript array, which on the client does not share the same limitation, or use other methods for iterating a "cursor".
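As one possible way of iterating the cursor instead of calling .toArray(), here is a minimal sketch that walks it document by document with for await. It assumes a driver version whose cursors support async iteration. The names collectGroupIds and fakeCursor are hypothetical helpers made up for this illustration; fakeCursor only stands in for a real cursor so the snippet runs without a database.

```javascript
// Hypothetical helper: gathers the _id of every document a $group stage emits.
// `cursor` can be any async iterable of documents shaped like { _id: value }.
async function collectGroupIds(cursor) {
  const ids = [];
  for await (const doc of cursor) {
    // Each grouped document carries one distinct internalNumber in _id.
    ids.push(doc._id);
  }
  return ids;
}

// Hypothetical stand-in for a driver cursor (assumption: the real cursor
// is async iterable in the same way).
async function* fakeCursor(docs) {
  for (const doc of docs) {
    yield doc;
  }
}

// Demo with in-memory data in place of a real collection.
collectGroupIds(fakeCursor([{ _id: 101 }, { _id: 102 }]))
  .then((ids) => console.log(ids.length, "distinct values"));
```

With a real collection, the cursor returned by collection.aggregate(...) from the snippet above would be passed in place of fakeCursor(...), provided the installed driver exposes async iteration on it.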
Upvotes: 4