Radek Simko

Reputation: 16126

Fastest way to update large amount of data

I have millions of rows in a MongoDB collection and need to update all of them. I've written a mongo shell (JS) script like this:

db.Test.find().forEach(function(row) {
    // change data on `row` here, then persist it
    db.Test.save(row);
});

which (I guess) should be faster than updating via a language driver, both because of the possible latency between the web server and the mongo server and simply because the driver is "something on top" while mongo is "something in the basement".

Even so, it updates approximately 2,100 records/sec on a quad-core 2.27 GHz processor with 4 GB RAM.

Since I know mongoimport can handle around 40k records/sec (on the same machine), I don't consider that speed anything "fast".

Is there any faster way?

Upvotes: 2

Views: 2491

Answers (1)

Gates VP

Reputation: 45287

There are two possible limiting factors here:

  1. Single write lock: MongoDB has only one (global) write lock, and this may be the determining factor.
  2. Disk access: if the data being updated is not already in memory, it has to be loaded from disk, which causes a slowdown (see the sketch after this list).
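A rough way to gauge the second factor from the mongo shell is to compare the collection's size to the memory mongod is using. This is only a sketch, using the Test collection name from the question; note that db.Test.stats() reports bytes while db.serverStatus().mem reports megabytes:

var stats = db.Test.stats();        // size / storageSize are in bytes
var mem = db.serverStatus().mem;    // resident / mapped are in MB
print("collection data size (MB): " + (stats.size / 1024 / 1024).toFixed(0));
print("mongod resident (MB): " + mem.resident + ", mapped (MB): " + mem.mapped);

If the collection (plus its indexes) is much larger than what fits in RAM, the update loop will constantly be paging documents in from disk.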

Is there any faster way?

The answer here depends on the bottleneck. Try running iostat and mongostat to see where the bottleneck lies. If iostat shows high disk IO, then you're being held back by the disk. If mongostat shows a high "lock%" then you've maxed out access to the global write lock.
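If mongostat isn't handy, an approximation of its "lock%" column can be sampled directly from the shell. This is just a sketch, and it assumes a MongoDB version from this era that still exposes globalLock.lockTime and globalLock.totalTime in serverStatus():

// Sample the global lock counters twice, one second apart
var s1 = db.serverStatus().globalLock;
sleep(1000);                        // shell sleep() takes milliseconds
var s2 = db.serverStatus().globalLock;
var lockPct = 100 * (s2.lockTime - s1.lockTime) / (s2.totalTime - s1.totalTime);
print("approximate lock% over the last second: " + lockPct.toFixed(1));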

If you've maxed out either the IO or the write lock, there is no simple code fix. If neither is the issue, it may be worth trying another driver.

Since I know mongoimport can handle around 40k records/sec (on the same machine)

This may not be a fair comparison: many people run mongoimport against a fresh database, and the data is generally just loaded into RAM.

I would start by checking iostat / mongostat.

Upvotes: 3
