Sam
Sam

Reputation: 46

Add new item in nested array

I need to update an embedded Fields array through Mongo console. I need to add a new item as "Test13" with default value as "1" in all the Fields subdocuments.

Here is a sample of my collection with name testmarket:

{ 
    "_id" : ObjectId("573c9801056848fef667bfde"), 
    "MST" : "RR", 
    "GST" : null, 
    "Fields" : [{
        "Test1" : "boolean", 
        "Test2" : "TestLot", 
        "Test3" : "TestLot", 
        "Test4" : null, 
        "Test5" : true, 
        "Test6" : true, 
        "Test7" : NumberInt(1), 
        "Test8" : false, 
        "Test9" : false, 
        "Test10" : false, 
        "Test11" : false, 
        "Test12" : null
     }, {
        "Test1" : "String", 
        "Test2" : "TestSerial", 
        "Test3" : "TestSerial", 
        "Test4" : null, 
        "Test5" : true, 
        "Test6" : true, 
        "Test7" : NumberInt(1), 
        "Test8" : false, 
        "Test9" : false, 
        "Test10" : false, 
        "Test11" : false, 
        "Test12" : null
    }]
}

Collection is very big and I need to add a new item in Fields collection.

I have tried running the query

db.testmarket.Fields.update(
    {}, 
    {$set : {"Test13":"1"}}, 
    {upsert:true, multi:true}
) 

Upvotes: 3

Views: 2750

Answers (4)

chridam
chridam

Reputation: 103305

You essentially need to grab a set of all the Fields arrays in your collection, iterate the collection and for each document iterate the Fields subset. For each Fields element update the collection using the $ positional operator.

For example, within the inner loop (since you have 2 nested loops), your update for the first Fields array element would be

db.testmarket.updateOne(
    {
        "_id" : ObjectId("573c9801056848fef667bfde"),
        "Fields.Test1" : "boolean",
        "Fields.Test2" : "TestLot",
        "Fields.Test3" : "TestLot",
        "Fields.Test4" : null,
        "Fields.Test5" : true,
        "Fields.Test6" : true,
        "Fields.Test7" : 1,
        "Fields.Test8" : false,
        "Fields.Test9" : false,
        "Fields.Test10" : false,
        "Fields.Test11" : false,
        "Fields.Test12" : null
    },
    { "$set": { "Fields.$.Test13": "1" } }
);

and the next update in the iteration would be:

db.testmarket.updateOne(
    {
        "_id" : ObjectId("573c9801056848fef667bfde"),
        "Fields.Test1" : "String",
        "Fields.Test2" : "TestSerial",
        "Fields.Test3" : "TestSerial",
        "Fields.Test4" : null,
        "Fields.Test5" : true,
        "Fields.Test6" : true,
        "Fields.Test7" : 1,
        "Fields.Test8" : false,
        "Fields.Test9" : false,
        "Fields.Test10" : false,
        "Fields.Test11" : false,
        "Fields.Test12" : null
    },
    { "$set": { "Fields.$.Test13": "1" } }
);

and so forth.

This algorithm can be implemented as follows:

db.testmarket.find().snapshot().forEach(function(doc) {
    doc.Fields.forEach(function(field) {
        var query = {},
            obj = { Fields: field };

        /* function to convert an object into dot notation */
        (function recurse(obj, current) {
            for(var key in obj) {
                var value = obj[key];
                var newKey = (current ? current + "." + key : key);  /* joined key with dot */
                if(value && typeof value === "object") {
                    recurse(value, newKey);  // it's a nested object, so do it again
                } else {
                    query[newKey] = value;  // it's not an object, so set the property
                }
            }
        })(obj);

        query["_id"] = doc._id;

        /* do the update */
        db.testmarket.updateOne(
            query,
            { "$set": { "Fields.$.Test13": "1" } }
        );
    });
});

As you can see from the above, performance is bound to be compromised since you have double nested for loop. The outer loop for the first iteration i.e. iterating the whole collection executes n times where n is the total number of documents in the collection.

For each iteration of the outer loop, the inner loop gets executed i times where i is the length of the Fields array, so the overall complexity can be calculated as follows: one for the first iteration plus two for the second iteration plus three for the third iteration and so on, plus n for the n-th iteration.

1+2+3+4+5+...+n = (n*(n-1))/2 --> O(n^2)

Doing the updates in a nested loop with O(n^2) complexity for very large collections is not very efficient. In that regard, you can optimise your code by taking advantage of the Bulk API which allows you to send the updates as streamlined batches i.e. instead of sending each update request in to the server with each iteration, you can bulk up the update operations into a single request which is faster and more efficient. The following shows how you can leverage the updates using bulkWrite() method.

For MongoDB version 3.0 and below:

var bulk = db.testmarket.initializeUnOrderedBulkOp(),
    counter = 0;

db.testmarket.find().snapshot().forEach(function(doc) {
    doc.Fields.forEach(function(field) {
        var query = {},
            obj = { Fields: field };

        /* function to convert an object into dot notation */
        (function recurse(obj, current) {
            for(var key in obj) {
                var value = obj[key];
                var newKey = (current ? current + "." + key : key);  /* joined key with dot */
                if(value && typeof value === "object") {
                    recurse(value, newKey);  // it's a nested object, so do it again
                } else {
                    query[newKey] = value;  // it's not an object, so set the property
                }
            }
        })(obj);

        query["_id"] = doc._id;

        /* load up update operations */
        bulk.find(query).updateOne({ "$set": { "Fields.$.Test13": "1" } });

        counter++;

        /* execute the update operations at once in bulk */
        if (counter % 500 === 0 ) {
            bulk.execute();
            bulk = db.testmarket.initializeUnOrderedBulkOp();
        }
    });
});

/* clean up the remaining update operations left in the queue */
if (counter % 500 !== 0 )
    bulk.execute();

MongoDB 3.2 or newer:

var ops = [];

 db.testmarket.find().snapshot().forEach(function(doc) {
    doc.Fields.forEach(function(field) {
        var query = {},
            obj = { Fields: field };

        /* function to convert an object into dot notation */
        (function recurse(obj, current) {
            for(var key in obj) {
                var value = obj[key];
                var newKey = (current ? current + "." + key : key);  /* joined key with dot */
                if(value && typeof value === "object") {
                    recurse(value, newKey);  // it's a nested object, so do it again
                } else {
                    query[newKey] = value;  // it's not an object, so set the property
                }
            }
        })(obj);

        query["_id"] = doc._id;

        /* load up update operations */
        ops.push({
            "updateOne": {
                "filter": query,
                "update": { "$set": { "Fields.$.Test13": "1" } }
            }
        });
        counter++;      
    });

    if (counter % 500 === 0) {
        db.testmarket.bulkWrite(ops);
        ops = [];
    }
});


if (counter % 500 !== 0) 
    db.testmarket.bulkWrite(ops);

The counter variable above is there to manage your bulk updates effectively if your collection is large. It allows you to batch the update operations and sends the writes to the server in batches of 500 which gives you a better performance as you are not sending every request to the server, just once in every 500 requests.

For bulk operations MongoDB imposes a default internal limit of 1000 operations per batch and so the choice of 500 documents is good in the sense that you have some control over the batch size rather than let MongoDB impose the default, i.e. for larger operations in the magnitude of > 1000 documents.

Upvotes: 1

Dineshaws
Dineshaws

Reputation: 2083

Here is the answer:

db.testmarket.find().toArray().forEach(
  function(obj){
    for(var i = 0; i < obj.Fields.length; ++i) { 
        obj.Fields[i]['Test13'] = '1';
    }
    db.testmarket.update({_id: obj._id}, obj);
  }
);

Thanks

Upvotes: 1

Rahul Kumar
Rahul Kumar

Reputation: 2831

db.testmarket.find({}).forEach(function(x){for(var i in x.Fields){x.Fields[i]['Test13']=1} db.testmarket.save(x);});

I don't think you can do it with update, update call at best will update any single item from the document, though it will do it for all the document with multi:true but not all the items per document in all the document.

In case if your collection is too large, you can call above method in batches with the use of limit and skip.

Upvotes: 0

Riya Saxena
Riya Saxena

Reputation: 99

db.getCollection('testmarket').update({"_id":ObjectId(573c9801056848fef667bfde)},{'$addToSet':{'Fields':{"Test13":1}}}); But I think you should give some name to your jsons inside array as the above code with simply insert a json{Test13:1} outside all the jsons. If you will change your structure like this Fields:[{"firstTest":{"Test1":"boolean",......},{"secondTest":{"Test1":"boolean",......}}}] then the above code can be rewritten as->

db.getCollection('testmarket').update({"_id":ObjectId(573c9801056848fef667bfde)},{'$addToSet':{'Fields.firstTest':{"Test13":1}}});

Upvotes: 0

Related Questions