Reputation: 943
I need to change a field's type from int32 to string. The data lives on a server, and the collection holds a huge number of documents.
With a simple update like the following, only a small part of the documents gets updated because of how long it takes:
db.collection.find({"identifier": {$exists: true}})
  .forEach(function(x) {
    db.collection.update({_id: x._id}, {$set: {"identifier": x.identifier.toString()}});
  });
So I decided to do a bulk change:
var bulk = db.collection.initializeUnorderedBulkOp();
bulk.find({"identifier": {$exists:true}}).update(
function(x) {
{_id: x._id}, {$set: {"identifier": x.identifier.toString()}}
});
bulk.execute();
But it gives an error and does not get executed.
How should I do the update for the bulk to work?
Upvotes: 0
Views: 1825
Reputation: 7496
According to the official docs, there is no bulk update that accepts a function; update() expects an update document. What you can do instead is emulate the bulk operation yourself with skip and limit.
For this to work, you have to choose the skip and limit values. With a batch size of 100, the limit is always 100, while the skip increases by 100 on every run.
For example, the first run:
db.collection.find({"identifier": {$exists: true}}).sort({_id: 1}).skip(0).limit(100)
  .forEach(function(x) {
    db.collection.update({_id: x._id}, {$set: {"identifier": x.identifier.toString()}});
  });
The second run:
db.collection.find({"identifier": {$exists: true}}).sort({_id: 1}).skip(100).limit(100)
  .forEach(function(x) {
    db.collection.update({_id: x._id}, {$set: {"identifier": x.identifier.toString()}});
  });
The third run:
db.collection.find({"identifier": {$exists: true}}).sort({_id: 1}).skip(200).limit(100)
  .forEach(function(x) {
    db.collection.update({_id: x._id}, {$set: {"identifier": x.identifier.toString()}});
  });
This way you can control what is being done for every batch of size 100.
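If you don't want to type every run by hand, the skip/limit pairs can be generated, and each batch can be sent in a single round trip with db.collection.bulkWrite(). This is only a minimal sketch: batchWindows and buildStringifyOps are hypothetical helper names of mine, not part of MongoDB, and I'm assuming identifier comes back from the shell as a plain number.

```javascript
// Hypothetical helpers; neither name comes from the MongoDB API.

// Returns the { skip, limit } pair for every run needed to cover `total` documents.
function batchWindows(total, batchSize) {
  const windows = [];
  for (let skip = 0; skip < total; skip += batchSize) {
    windows.push({ skip: skip, limit: batchSize });
  }
  return windows;
}

// Turns one batch of documents into operations for db.collection.bulkWrite().
// Assumption: doc.identifier is a plain JavaScript number.
function buildStringifyOps(docs) {
  return docs.map(function (doc) {
    return {
      updateOne: {
        filter: { _id: doc._id },
        update: { $set: { identifier: String(doc.identifier) } }
      }
    };
  });
}

// Shell usage (needs a connected `db`, so it is left commented out):
// var total = db.collection.count({ identifier: { $exists: true } });
// batchWindows(total, 100).forEach(function (w) {
//   var docs = db.collection.find({ identifier: { $exists: true } })
//     .sort({ _id: 1 }).skip(w.skip).limit(w.limit).toArray();
//   if (docs.length > 0) {
//     db.collection.bulkWrite(buildStringifyOps(docs), { ordered: false });
//   }
// });
```

The new string value is computed on the client for each document, so nothing has to be passed to the server as a function.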
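Remember to ALWAYS sort before skipping and limiting; otherwise the order of the results is not guaranteed, and skip can return overlapping batches or miss documents. You can sort by whatever criteria you want, as long as the update itself does not change the sort order.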
You could also speed up the process by having the find operation match only the documents that still need to be converted:
db.collection.find({"identifier": {$exists: true, $not: {$type: "string"}}})
  .forEach(function(x) {
    db.collection.update({_id: x._id}, {$set: {"identifier": x.identifier.toString()}});
  });
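But don't combine both approaches; choose one or the other. With the type filter, converted documents drop out of the find results, so an increasing skip would pass over documents that have not been converted yet.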
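To see why mixing the two goes wrong, here is a small in-memory simulation. It makes no MongoDB calls: the array stands in for the collection, the typeof check stands in for the $type filter, and simulate is just a name I made up for the illustration.

```javascript
// In-memory stand-in for the collection; no MongoDB calls are made here.
function simulate(useSkip) {
  const docs = [];
  for (let i = 0; i < 10; i++) {
    docs.push({ _id: i, identifier: i });
  }
  const batchSize = 2;
  let skip = 0;
  while (true) {
    // Equivalent of the type filter: only unconverted documents match.
    const pending = docs.filter(function (d) {
      return typeof d.identifier !== "string";
    });
    const batch = pending.slice(skip, skip + batchSize);
    if (batch.length === 0) break;
    // Convert the current batch, as the update in forEach would.
    batch.forEach(function (d) {
      d.identifier = String(d.identifier);
    });
    // Combining both approaches means also advancing the skip.
    if (useSkip) skip += batchSize;
  }
  // Number of documents that were never converted.
  return docs.filter(function (d) {
    return typeof d.identifier !== "string";
  }).length;
}
```

With 10 documents and a batch size of 2, simulate(false) returns 0 (filter only, always taking the first batch), while simulate(true) returns 4: the documents with _id 2, 3, 6 and 7 are skipped and never converted.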
Hope this helps.
Upvotes: 1