Clive
Clive

Reputation: 55

Slow query when iterating for ID's over documents with binary data

We are required to iterate over ID's of all documents in a specific collection, these documents contain a binary data field.

When we query with a projection that only returns the _id field, the query takes approximately 80 seconds to return 200000 results.

Is there a faster way to get a cursor that only contains the _id fields and doesn't degrade with the the size of the binary data field? If we remove the data field from all of the documents, the query returns in a few hundred milliseconds.

Steps to reproduce in mongo shell:

function randomString() { 
    var chars = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXTZabcdefghiklmnopqrstuvwxyz"; 
    var randomstring = ''; 
    var string_length = 64000;
    for (var i=0; i<string_length; i++) { 
        var rnum = Math.floor(Math.random() * chars.length); 
        randomstring += chars.substring(rnum,rnum+1); 
    } 
    return randomstring; 
}

var data = randomString()
for (var i = 1; i <= 200000; i++) { db.testData.insert({data: new BinData(0, data)}) }

var cursor = db.testData.find({}, {_id: true})
cursor.forEach(printjson)

Upvotes: 0

Views: 531

Answers (2)

Clive
Clive

Reputation: 55

We found the long query times was due to a possible different sort order than the index, causing mongodb to need to access the documents when running the query, even if the query is covered by the indexes.

Because the index on _id is sorted in ascending order, simply sorting on the find returns the results in just over 100 milliseconds:

var cursor = db.testData.find({}, {_id: true}).sort({_id: 1})

Upvotes: 1

Tost
Tost

Reputation: 51

It seems normal that it takes that long... 75000ms in my case.

Why do you have to iterate through the _id? What are you searching for? Maybe you can add a timestamp or use the timestamp that is integrated in the id. Then you could limit the selection or specify your find-order.

PS.: The aggregate-order has more options, if the find is not enough.

Upvotes: 0

Related Questions