Reputation: 107
The scenario is this: I have 2 million documents in MongoDB and I want to process them in batches of, say, 100 or 1,000 (because V8 memory is scarce). After reading a batch of documents I want to do some computation and write the result to a file, which might take longer than 10 minutes, before I come back and get the next batch. How can I do that with the Node.js MongoDB driver?
I couldn't find all the methods I need in the Node.js MongoDB driver. For example, the mongo shell has cursor.objsLeftInBatch(), which tells you how many documents are left in the current batch, and I could not find an equivalent in Node.js.
Another important piece of functionality I was looking for in the Node.js MongoDB driver is how to set the cursor to never time out (this is possible in the mongo shell and in other language drivers, but I am not sure about Node.js).
var async = require('async'); // assuming the "async" module is used below

var hash_map = {};
db.collection(collection_name).find().batchSize(100).each(function(err, docs) {
    if (err) throw err;
    // I expected each() to hand me a whole batch of 100 documents here
    docs.forEach(function(doc) {
        var id = doc._id; // assume this is a string, not an ObjectID
        hash_map[id] = doc.key1;
    });
    // This async series would take say 20 minutes, or just assume it takes a
    // long time. Now, would the cursor time out before I retrieve the next batch?
    async.series([
        processData.bind(null, hash_map),
        writeDataToFile
    ], function(err) {
        if (err) throw err;
        return callback();
    });
});
Upvotes: 3
Views: 3859
Reputation: 151112
This is a wrong interpretation of what "batchSize" does. All it means (essentially as a parameter on the cursor returned by .find(), despite being exposed as a driver method) is that the server will return a "batch" of 100 results (in this case) at a time, which is then iterated as a "cursor".
You are missing the concept of a "cursor". You do not "actually" get back a "data" result that contains 100 records or "collection items" in the overall result. You just have a "pointer" that allows you to "fetch" a single "record/document" at a time via the .next() method.
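To make that distinction concrete, here is a minimal sketch of driving the cursor manually with .next(), assuming the callback-style API of the 2.x Node driver; processDoc is a hypothetical handler for the slow per-document work:

var cursor = db.collection(collection_name).find().batchSize(100);

function iterate() {
    cursor.next(function(err, doc) {
        if (err) throw err;
        if (doc == null) return; // cursor exhausted, nothing left to fetch
        // processDoc is hypothetical: do the slow work, then pull the next doc
        processDoc(doc, function() {
            iterate();
        });
    });
}

iterate();

The batchSize(100) only controls how many documents the server ships per round trip; each .next() call still hands you exactly one document, transparently fetching a fresh batch from the server when the current one is exhausted.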
Convenience methods like .each() and .toArray() are meant for "small" result sets where the results are basically "transformed" into an array for further processing, either manually via .toArray() or implicitly via methods like .each().
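For contrast, a small, bounded query is where .toArray() is reasonable; this sketch (with an illustrative limit of 50) materializes the whole result in memory at once:

db.collection(collection_name).find().limit(50).toArray(function(err, docs) {
    if (err) throw err;
    docs.forEach(function(doc) {
        // the entire result set is now a plain in-memory array
    });
});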
For large result sets you want the "stream" API provided by Node and the MongoDB driver. See here in the documentation for how to invoke that on current versions.
Newer releases of the MongoDB Node driver will return a Node stream interface by default.
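As a rough sketch of that streaming approach, assuming a 2.x driver where the cursor itself is a Node Readable stream, you can pause() the stream while the slow asynchronous work runs and resume() when it completes; writeDoc here is a hypothetical stand-in for the expensive processing:

var stream = db.collection(collection_name).find().batchSize(100);

stream.on('data', function(doc) {
    stream.pause(); // stop emitting documents while we work
    writeDoc(doc, function(err) { // slow asynchronous processing of one document
        if (err) throw err;
        stream.resume(); // ask the stream for the next document
    });
});

stream.on('end', function() {
    console.log('all documents processed');
});

Because the stream is paused during processing, this pattern never holds more than one document (plus the current server batch) in memory, no matter how long each writeDoc call takes.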
The point being that you "could" use a cursor modifier such as .limit() here and "loop" the results in "pages", but in your context this is not the most efficient way. Look into the streaming API as referenced by the links.
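For completeness, the "pages" approach would look roughly like the following sketch, using skip()/limit() with a stable sort; processBatch is a hypothetical batch handler. Note that skip() forces the server to walk past every previously-read document, which is why this degrades on a 2-million-document collection:

var pageSize = 100;

function fetchPage(page, done) {
    db.collection(collection_name).find()
        .sort({ _id: 1 })    // stable order so pages do not overlap
        .skip(page * pageSize)
        .limit(pageSize)
        .toArray(function(err, docs) {
            if (err) return done(err);
            if (docs.length === 0) return done(); // no more pages
            processBatch(docs, function() {       // hypothetical batch handler
                fetchPage(page + 1, done);        // move on to the next page
            });
        });
}

fetchPage(0, function(err) {
    if (err) throw err;
});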
Upvotes: 3