Reputation: 839
Hello,
I use Node.js to provide an API for storing data on a MongoDB database.
I ran multiple tests on a read
method, which takes ids and returns the corresponding documents. The point is that I must return these documents in the specified order. To ensure that, I use the following code:
// Sequentially fetch every element
function read(ids, callback) {
var i = 0;
var results = [];
function next() {
db.findOne(ids[i], function (err, doc) {
results.push(err ? null : doc);
if (ids.length > ++i) {
return next();
}
callback(results);
});
}
next();
}
This way, documents are fetched one-by-one, in the right order. It takes about 11s on my laptop to retrieve 27k documents.
However, I thought that it was possible to improve this method:
// Asynchronously map the whole array
var async = require('async');
function read(ids, callback) {
async.map(ids, db.findOne.bind(db), callback):
}
After running a single test, I was quite satisfied seeing that the 27k documents were retrieved in only 8s using simpler code.
The problem happens when I repeat the same request: the response time keeps growing (proportionally to the number of elements retrieved): 9s 10s 11s 12s...
. This problem does not happen in the sequential version.
I tried two versions of Node.js, v6.2.0 and v0.10.29. The problem is the same. What causes this latency and how could I suppress it?
Upvotes: 4
Views: 365
Reputation: 14992
Try to use async.mapLimit
to prevent overload. You need some tests to tune limit value with your environment.
But find({_id: {$in: list}})
is always better, because single database request instead of multiple.
I suggest you to try to perform restore of original order client-side.
Something like this:
function read(ids, cb) {
db.find(
{_id: {$in: ids.map(id => mongoose.Types.ObjectId(id))}},
process
);
function process(err, docs) {
if (err) return cb(err);
return cb(null, docs.sort(ordering))
}
function ordering(a, b) {
return ids.indexOf(b._id.toString()) - ids.indexOf(a._id.toString());
}
}
May be, find query needs to be corrected, I can't to know what exact mongodb driver you use.
This code is first-try, more manual sorting can improve performance alot. [].indexOf
is heavy too(O(n)
).
But I'm almost sure, even as-is now, it will work much faster.
Possible ordering
replacement:
var idHash = {};
for(var i = 0; i < ids.length; i++)
idHash[ids[i]] = i;
function ordering(a, b) {
return idHash[b._id.toString()] - idHash[a._id.toString()];
}
Any sort algorithm has O(nlogn)
in best case, but we already know result position of each found document, so, we can restore original order by O(n)
:
var idHash = ids.reduce((c, id, i) => (c[id] = i, c), {});
function process(err, docs) {
if (err) return cb(err);
return cb(null,
docs.reduce(
(c, doc) => (c[idHash[doc._id.toString()]] = doc, c),
ids.map(id => null))) //fill not_found docs by null
}
Functional style makes code flexier. For example this code can be easy modified to use async.reduce
to be less sync-blocking.
Upvotes: 4