mpc75

Reputation: 989

What is the best way to slowly (500/second) update many documents in a MongoDB database with a new field and value?

I have a MongoDB database in which I want to update about 100,000 documents with a "score" each day. The problem with my current implementation is that it issues the updates very quickly (about 2,000 per second), while my cluster is limited to 500 updates per second (M5 tier), so MongoDB sporadically returns an error (MongoDB support confirmed that this is why I sometimes get the error).

Is there a way to perhaps batch the updates or a better way to do what I'm doing?

Here's the code I am using. If I just turn it off when I get an error and start it back up it will eventually update all the documents, but that's an unsustainable solution:

await client
  .db("test")
  .collection("collection_name")
  .find({ score: { $exists: false } })
  .forEach(async data => {
    await client
      .db("test")
      .collection("collection_name")
      .updateOne(
        { _id: data._id },
        {
          $set: {
            score: GetScore(data)
          }
        }
      );
  });
client.close();

Upvotes: 0

Views: 214

Answers (1)

abondoa

Reputation: 1783

One problem might be that the mongo library likely does not await the callback passed to forEach, so several of your updates are issued concurrently: the second is sent before the first has finished, and so on.

Instead of forEach, you could iterate the cursor yourself with hasNext and next, awaiting each update before issuing the next one. If that is still too fast, you could also await a promise that resolves after a short delay (a "sleep"), though that might not be needed. The loop without the sleep looks like this:

// find() returns a cursor synchronously, so no await is needed here.
const collection = client.db("test").collection("collection_name");
const cursor = collection.find({ score: { $exists: false } });
// Process one document at a time: each update finishes before the next starts.
while (await cursor.hasNext()) {
  const data = await cursor.next();
  await collection.updateOne(
    { _id: data._id },
    { $set: { score: GetScore(data) } }
  );
}

Docs:
http://mongodb.github.io/node-mongodb-native/3.5/api/Cursor.html#next
http://mongodb.github.io/node-mongodb-native/3.5/api/Cursor.html#hasNext

Again, pausing between updates (a "sleep") might not be necessary once your queries run sequentially.
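If the sequential loop still turns out to exceed the limit, the pause could be sketched roughly as below. This is only an illustration, not something from the driver: `sleep` and `forEachThrottled` are hypothetical helper names, and the function assumes nothing more than a cursor-like object exposing the `hasNext` and `next` methods linked above. It waits after each document so that no more than `maxPerSecond` handler calls start per second.

```javascript
// Hypothetical helper: resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: drain a cursor-like object one document at a time,
// spacing the handler calls at least 1000 / maxPerSecond milliseconds apart.
// Returns the number of documents processed.
async function forEachThrottled(cursor, handler, maxPerSecond) {
  const minIntervalMs = 1000 / maxPerSecond;
  let processed = 0;
  while (await cursor.hasNext()) {
    const startedAt = Date.now();
    const doc = await cursor.next();
    await handler(doc); // wait for this update before fetching the next doc
    processed += 1;
    // Sleep only for whatever is left of this document's time slot.
    const elapsedMs = Date.now() - startedAt;
    if (elapsedMs < minIntervalMs) {
      await sleep(minIntervalMs - elapsedMs);
    }
  }
  return processed;
}

// Possible usage, assuming `client` and `GetScore` from the question:
//
// const collection = client.db("test").collection("collection_name");
// await forEachThrottled(
//   collection.find({ score: { $exists: false } }),
//   (data) =>
//     collection.updateOne(
//       { _id: data._id },
//       { $set: { score: GetScore(data) } }
//     ),
//   500
// );
```

Because each update is awaited before the timer is checked, a slow round trip simply uses up its own time slot and no extra delay is added, so the sleep only kicks in when the database answers faster than the limit allows.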

Upvotes: 1
