Ernani

Reputation: 1049

NodeJS MongoDB avoid Cursor Timeout

I would like to loop through all documents in a specific collection of my MongoDB database. However, every attempt I made failed because the cursor timed out. Here is my code:

let MongoClient = require('mongodb').MongoClient;
const url = "my connection URI"
let options = { socketTimeoutMS: 120000, connectTimeoutMS: 120000, keepAlive: 100, poolSize: 5 }

MongoClient.connect(url, options,
  function(err, db) {
  if (err) throw err
  let dbo = db.db("notes")
  let collection = dbo.collection("stats-network-consumption")

  let stream = collection.find({}, { timeout: false }).stream()

  stream.on("data", function(item) {
    printTask(item)
  })

  stream.on('error', function (err) {
    console.error(err)
  })

  stream.on("end", function() {
    console.log("DONE!")
    db.close()
  })

})

The code above runs for about 15 seconds, retrieves between 6000 and 8000 documents, and then throws the following error:

{ MongoError: cursor does not exist, was killed or timed out
    at queryCallback (/Volumes/safezone/development/workspace-router/migration/node_modules/mongodb-core/lib/wireprotocol/2_6_support.js:136:23)
    at /Volumes/safezone/development/workspace-router/migration/node_modules/mongodb-core/lib/connection/pool.js:541:18
    at process._tickCallback (internal/process/next_tick.js:150:11)
  name: 'MongoError',
  message: 'cursor does not exist, was killed or timed out' }

I need to retrieve around 50000 documents, so I need a way to avoid the cursor timeout.

As seen in the code above, I've tried increasing socketTimeoutMS and connectTimeoutMS, which had no effect on the cursor timeout.

I have also tried replacing stream with forEach and adding .addCursorFlag('noCursorTimeout', true), which did not help either.

I've tried everything I could find about the mongodb driver. I have not tried mongoose or alternatives because they use schemas, and I'll later have to update the type of an attribute (which can be tricky with mongoose schemas).

Upvotes: 2

Views: 4369

Answers (1)

kevinadi

Reputation: 13775

Having a cursor with no timeout is generally not recommended.

The reason is that the server will never close such a cursor on its own, so if your app crashes and you restart it, it will open another no-timeout cursor on the server. Restart your app often enough, and those cursors add up.

A no-timeout cursor on a sharded cluster would also prevent chunk migration.

If you need to retrieve a large result set, the cursor should not time out, since the results are sent in batches and the cursor is reused to get the next batch.

The standard cursor timeout is 10 minutes, so it is possible to lose the cursor if you need more than 10 minutes to process a batch.
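To illustrate the batching point, here is a minimal sketch (not the only way to do this) that drains a cursor one document at a time and closes it when finished. The helper name processAll and the batch size of 500 are hypothetical, and it assumes a driver version whose cursors can be iterated with for await. The idea is only that each batch should be processed well within the timeout before the next one is requested.

```javascript
// Sketch: drain a cursor sequentially, awaiting the handler for each document.
// Assumption: `cursor` is async-iterable (true for recent MongoDB Node.js driver cursors).
async function processAll(cursor, handle) {
  let count = 0
  try {
    for await (const doc of cursor) {
      // Keep this step quick: slow work here delays the request for the next batch.
      await handle(doc)
      count++
    }
  } finally {
    // Release the server-side cursor even if handle() throws.
    if (typeof cursor.close === "function") await cursor.close()
  }
  return count
}

// Hypothetical usage with the collection from the question:
//   const cursor = collection.find({}).batchSize(500)
//   const n = await processAll(cursor, printTask)
```

A smaller batch size means more round trips to the server, but less work between them, so the cursor is less likely to sit idle for long.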

In your code example, your use of stream() might be interfering with your intent. Try using each() (example here) on the cursor instead.

If you need to monitor a collection for changes, you might want to take a look at Change Streams, a new feature in MongoDB 3.6.

For example, your code could be modified like this:

let collection = dbo.collection("stats-network-consumption")
let stream = collection.watch()
// Each change to the collection arrives as a "change" event
stream.on("change", function(change) {
  console.log(change)
})

Note that to enable change stream support, the driver you're using must support MongoDB 3.6 features and the watch() method. See Driver Compatibility Page for details.

Upvotes: 2
