mmmm
mmmm

Reputation: 3938

Mongo mass update to lower case

I'm trying to set given fields to lowercased in collection ( using php ). I read Update MongoDB collection using $toLower and trying with db.myCollection.update({_id: e._id}, {$set: {UserName: e.UserName.toLowerCase() } ( in php it was $collection->update(['_id' => 'e._id'], ['$set' => [ 'field' => 'e.field.toLowerCase()']]); ) but it didn't worked.

Is there any other way or I've just made a mistake?

Upvotes: 1

Views: 3916

Answers (3)

Neil Lunn
Neil Lunn

Reputation: 151132

The general premise of looping is the correct approach as there is presently no way of referring to the data of an existing field in a document for this type of update operation.

You can however get a significant performance gain here by using the Bulk Operations API which is available from MongoDB 2.6 and greater. Modern driver releases all support these methods, so for current PHP:

 $client = new MongoClient();
 $collection = $client->selectCollection("database","collection");

 $batch = new MongoUpdateBatch($collection);
 $counter = 0;

 foreach ( $collection->find() as $doc ) {

     $batch->add(
         array(
             "q" => array( '_id' => $doc['_id'] ),
             "u" => array( 
                 '$set' => array(
                     'UserName' => strtolower($doc['UserName'])
                 )
             )
         )
     );
     $counter++;

     if ( $counter % 1000 === 0 ) {
         $retval = $batch->execute(array( 'w' => 1));
         $counter++;
         $batch = new MongoUpdateBatch($collection);        
     }
 }

 if ( $counter > 0 ) {
     $retval = $batch->execute(array( 'w' => 1 ));
 }

The class there extends the MongoWriteBatch class for update operations. Essentially, each query and update portion is added to the "batch" via the .add() method, and is only actually sent to the server on the call to .execute(). There is some management of the "size" of these operations implemented with the modulo operation, which should be generally handled by the driver but keeps this in manageable sizes should you want to check the write result.

The key here is that rather than waiting for the write response from the server for every single update, the operations are sent and responded to in "batches". The reduction in "overhead" here when performing mass updates is considerable as there is less "back and forth" when communicating with the server.

In the current release shell, all of the standard methods implement the bulk API methods "underneath" and only "fall back" to the legacy implementations when connecting to a server with a version below 2.6 where the API is not available.

So if you have to perform this sort of update, and you have MongoDB 2.6 available, then changing your code to do a loop with bulk operations as above will add a significant speed boost

Upvotes: 8

Ngo Quang Thuc
Ngo Quang Thuc

Reputation: 1

Try this on Mongo Shell:

db.collection.find().forEach(function(obj){

    obj.field= obj.field.toLowerCase();
    obj.save()

});

Upvotes: 0

Sammaye
Sammaye

Reputation: 43884

In PHP:

$mongo = new MongoClient();
foreach($mongo->db->collection->find() as $doc){
    $mongo->db->collection->update(
        ['_id' => $doc['_id']], 
        ['$set' => ['field' => strtolower($doc['field'])]]
    );
}

You can also use save() but update() might actually be faster here.

You cannot do this particular operation via one update() call currently ($toLower is actually for projection within the aggregation framework) as such that is why I am iterating a cursor of the collection.

Upvotes: 3

Related Questions