Thomas
Thomas

Reputation: 2070

MongoDB: How to rename a field using regex

I have a field in my documents, that is named after its timestamp, like so:

{
    _id: ObjectId("53f2b954b55e91756c81d3a5"),
    domain: "example.com",
    "2014-08-07 01:25:08": {
        A: [
            "123.123.123.123"
        ],
        NS: [
            "ns1.example.com.",
            "ns2.example.com."
        ]
    }
}

This is very impractical for queries, since every document has a different timestamp. Therefore, I want to rename this field, for all documents, to a fixed name. However, I need to be able to match the field names using regex, because they are all different.

I tried doing this, but this is an illegal query.

db['my_collection'].update({}, {$rename:{ /2014.*/ :"201408"}}, false, true);

Does someone have a solution for this problem?

SOLUTION BASED ON NEIL LUNN'S ANSWER:

conn = new Mongo();
db = conn.getDB("my_db");

var bulk = db['my_coll'].initializeOrderedBulkOp();
var counter = 0;

db['my_coll'].find().forEach(function(doc) {

    for (var k in doc) {
            if (k.match(/^2014.*/) ) {
                print("replacing " + k)
                var unset = {};
                unset[k] = 1;
                bulk.find({ "_id": doc._id }).updateOne({ "$unset": unset, "$set": { WK1: doc[k]} });
                counter++;
            }

    }

    if ( counter % 1000 == 0 ) {
        bulk.execute();
        bulk = db['my_coll'].initializeOrderedBulkOp();
    }

});

if ( counter % 1000 != 0 )
    bulk.execute();

Upvotes: 1

Views: 1426

Answers (2)

Neil Lunn
Neil Lunn

Reputation: 151112

This is not a mapReduce operation, not unless you want a new collection that consists only of the _id and value fields that are produced from mapReduce output, much like:

    "_id": ObjectId("53f2b954b55e91756c81d3a5"), 
    "value": { 
        "domain": "example.com",
        ... 
    } 
}

Which at best is a kind of "server side" reworking of your collection, but of course not in the structure you want.

While there are ways to execute all of the code in the server, please don't try to do so unless you are really in a spot. These ways generally don't play well with sharding anyway, which is usually where people "really are in a spot" for the sheer size of records.

When you want to change things and do it in bulk, you generally have to "loop" the collection results and process the updates while having access to the current document information. That is, in the case where your "update" is "based on" information already contained in fields or structure of the document.

There is therefore not "regex replace" operation available, and there certainly is not one for renaming a field. So let's loop with bulk operations for the "safest" form of doing this without running the code all on the server.

var bulk = db.collection.initializeOrderedBulkOp();
var counter = 0;

db.collection.find().forEach(function(doc) {

    for ( var k in doc ) {
        if ( doc[k].match(/^2014.*/) ) {
            var update = {};
            update["$unset"][k] = 1;
            update["$set"][ k.replace(/(\d+)-(\d+)-(\d+).+/,"$1$2$3") ] = doc[k];
            bulk.find({ "_id": doc._id }).updateOne(update);
            counter++;
        }
    }

    if ( counter % 1000 == 0 ) {
        bulk.execute();
        bulk = db.collection.initializeOrderedBulkOp();
    }

});

if ( counter % 1000 != 0 )
    bulk.execute();

So the main things there are the $unset operator to remove the existing field and the $set operator to create the new field in the document. You need the document content to examine and use both the "field name" and "value", so hence the looping as there is no other way.

If you don't have MongoDB 2.6 or greater on the server then the looping concept still remains without the immediate performance benefit. You can look into things like .eval() in order to process on the server, but as the documentation suggests, it really is not recommended. Use with caution if you must.

Upvotes: 3

Philipp
Philipp

Reputation: 69663

As you already recognized, value-keys are indeed very bad for the MongoDB query language. So bad that what you want to do doesn't work.

But you could do it with a MapReduce. The map and reduce functions wouldn't do anything, but the finalize function would do the conversion in Javascript.

Or you could write a little program in a programming language of your which reads all documents from the collection, makes the change, and writes them back using collection.save.

Upvotes: 0

Related Questions