Jim GB

Reputation: 93

Efficient way to modify the time format of a field for all documents in MongoDB

I have a collection that contains three hundred million documents. Each document has a "created_at" field that stores the time as a string, like this: 'Thu Feb 05 09:25:38 +0000 2015'

I want to convert every "created_at" field to a date type that MongoDB supports natively. So I wrote a simple Ruby script:

collection.find.each do |document|
  document[:created_at] = Time.parse document[:created_at]
  collection.save(document)
end

It does change the time format as I wished, but the script has been running for 50 hours and there is no sign of it finishing.

Is there a better way to do this task? A MongoDB shell script or a Python script would also work for me.

By the way, this collection is not indexed, since documents are continuously being inserted into it.

Upvotes: 1

Views: 142

Answers (1)

Neo-coder

Reputation: 7840

Using the mongo shell's bulk update API, you can convert the string to an ISODate in batches, as below:

var bulk = db.collectionName.initializeOrderedBulkOp();
var counter = 0;
db.collectionName.find().forEach(function(data) {
    var updoc = {
      "$set": {}
    };
    var myKey = "created_at";
    updoc["$set"][myKey] = new Date(Date.parse(data.created_at));
    // queue the update
    bulk.find({
      "_id": data._id
    }).update(updoc);
    counter++;
    // Drain and re-initialize every 1000 update statements
    if(counter % 1000 == 0) {
      bulk.execute();
      bulk = db.collectionName.initializeOrderedBulkOp();
    }
});
// Flush whatever is still queued after the last full batch
if (counter % 1000 != 0) bulk.execute();
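
One caveat with the snippet above: Date.parse on a non-ISO string such as 'Thu Feb 05 09:25:38 +0000 2015' is implementation-dependent, so it may be worth parsing the fields explicitly. The following is only a sketch under a few assumptions: it needs a JavaScript engine with ES2015 support (mongosh or Node.js), every string follows exactly the format shown in the question, and the helper names parseCreatedAt and buildUpdateOps are made up for illustration. It builds the operations as plain objects, which also lets the script skip documents that were already converted if it has to be restarted.

```javascript
// Month abbreviations as they appear in the strings from the question.
const MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
                "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];

// Hypothetical helper: parse 'Thu Feb 05 09:25:38 +0000 2015' into a Date
// without relying on engine-specific Date.parse behaviour.
// Returns null when the string does not match the expected layout.
function parseCreatedAt(text) {
  const m = /^\w{3} (\w{3}) (\d{2}) (\d{2}):(\d{2}):(\d{2}) ([+-])(\d{2})(\d{2}) (\d{4})$/.exec(text);
  if (m === null) return null;
  const month = MONTHS.indexOf(m[1]);
  if (month < 0) return null;
  // Interpret the clock fields as if they were UTC, then remove the offset.
  const utcMillis = Date.UTC(+m[9], month, +m[2], +m[3], +m[4], +m[5]);
  const offsetMinutes = (m[6] === "-" ? -1 : 1) * (+m[7] * 60 + +m[8]);
  return new Date(utcMillis - offsetMinutes * 60000);
}

// Hypothetical helper: turn a batch of documents into updateOne operations.
// Documents whose created_at is not a string (already converted) or cannot
// be parsed are skipped and left untouched.
function buildUpdateOps(docs) {
  const ops = [];
  for (const doc of docs) {
    if (typeof doc.created_at !== "string") continue;
    const parsed = parseCreatedAt(doc.created_at);
    if (parsed === null) continue;
    ops.push({
      updateOne: {
        filter: { _id: doc._id },
        update: { $set: { created_at: parsed } }
      }
    });
  }
  return ops;
}
```

Assuming MongoDB 3.2 or later, each batch of operations could then be sent with db.collectionName.bulkWrite(ops, { ordered: false }). Since the updates do not depend on one another, an unordered batch does not stop at the first failed update. Restricting the cursor to { created_at: { $type: 2 } } (BSON type 2 is string) should keep a restarted run from rewriting documents that are already dates, although without an index that query still scans the whole collection.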

Upvotes: 2
