enavuio

Reputation: 1500

How to replace specific fields in mongodb but return the rest

I am not sure of the best way to design this. I have a collection whose documents I want to reinsert into a new collection, keeping the structure exactly the same apart from a few fields.

{
    _id: "123",
    date: "1900-01-01T11:00:00.0000000",
    name: "joe",
    birthday: "1999-01-01"
}

The documents could contain other fields that I do not know about, but I know for a fact that I want to convert all date strings to the ISO Date type, so I only need to be specific about 4-5 fields out of the possible 20+.

Example of final:

{
    _id: "123",
    date: ISODate("1900-01-01T11:00:00.0000000"),
    name: "joe",
    birthday: ISODate("1999-01-01")
}

I was thinking I could create a few new fields with an $addFields stage like so:

db.collection.aggregate([
  // add converted copies of the date fields
  {
    $addFields: {
      convertDate: ISO_date(date),
      convertBirthdate: ISO_date(birthday)
    }
  },
  {
    // stuck here
    // select all
    // replace the date strings with the new fields
    $project: {
      // select * but avoid writing each field out as I could miss a few
      *,
      // replace date with the converted field
      date: $convertDate,
      // replace birthday
      birthday: $convertBirthdate
    }
  }
])

Please let me know if this is possible, or if there is a more efficient way to do it.

Upvotes: 3

Views: 3519

Answers (2)

Buzz Moschetti

Reputation: 7578

Here's a one-stage variation that walks each document, attempting to convert any string it finds to a date; if it cannot, it keeps the original value and type. The $merge or $out stage would be the same as in the other answers here. This is not a nuanced solution, but if you know your datestrings are good and nothing else smells like a datestring, then it could be useful. It is worth noting, however, that if you intend to do a mass conversion as a one-off (or at least rarely, and you have millions of documents), it is likely much more efficient to dump the material to disk, process it externally, and then use mongoimport to bulk re-load the material with multiple threads (the -j option).

db.foo.aggregate([
    {$replaceRoot: {newRoot: {$arrayToObject: {$map: {
        input: {$objectToArray: "$$CURRENT"},
        as:    "z",
        in:    {k:"$$z.k",
                v:{$cond:[{$ne:["string",{$type:"$$z.v"}]},"$$z.v",
                      {$convert: {input: "$$z.v", to: "date", onError: "$$z.v"}}]}
               }
        }} }
    }}
]);
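For instance, to land the converted documents in a new collection, the same stage could be followed by $out; "foo_converted" below is just an assumed target collection name, not something from the question:

db.foo.aggregate([
    {$replaceRoot: {newRoot: {$arrayToObject: {$map: {
        input: {$objectToArray: "$$CURRENT"},
        as:    "z",
        in:    {k:"$$z.k",
                v:{$cond:[{$ne:["string",{$type:"$$z.v"}]},"$$z.v",
                      {$convert: {input: "$$z.v", to: "date", onError: "$$z.v"}}]}
               }
        }} }
    }},
    // write the transformed documents to a new collection ("foo_converted" is an assumed name)
    {$out: "foo_converted"}
]);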

Here is the solution expanded with comments to explain what is going on.

db.foo.aggregate([
    // Turn the object into a k-v array named X:
    {$project: {X: {$objectToArray: "$$CURRENT"}}}

    //  Use map to walk the X array.  For each k-v encountered, if the
    //  type of v ($$z.v) is NOT a string, set v = $$z.v (itself), else
    //  use the $convert function to try to make it a date.  If THAT fails,
    //  then convert will use the old $$z.v value.
    ,{$addFields: {Z: {$map: {
        input: "$X",
        as:    "z",
        in:    {k: "$$z.k",
                v: {$cond: {if:   {$ne: ["string", {$type: "$$z.v"}]},
                            then: "$$z.v",
                            else: {$convert: {input: "$$z.v", to: "date", onError: "$$z.v"}} }}
               }
        }}
    }}

    // Turn Z from a k-v array back into an object and "lift" it into
    // the root document.
    ,{$replaceRoot: {newRoot: {$arrayToObject: "$Z"}}}
]);

Here is the bigger brother, which uses the $function operator available from version 4.4. It recursively walks a document, sniffs out date-like strings, and converts them. It is highly recommended to add an $out stage at the end of the pipeline and then do some testing, because this approach can both change more than you intend and miss certain less popular datestring formats.

db.foo.aggregate([
    {$replaceRoot: {newRoot: {$function: {
        body: function(obj) {
            var process = function(holder, spot, value) {
                if(Array.isArray(value)) { // test FIRST since [] instanceof Object is true!
                    for(var jj = 0; jj < value.length; jj++) {
                        process(value, jj, value[jj]);
                    }
                } else if(value instanceof Object) {
                    walkObj(value);
                } else {
                    if('string' === typeof value) {
                        var q = Date.parse(value);
                        if(!isNaN(q)) {
                            holder[spot] = new Date(q);
                        }
                    }
                }
            };
            var walkObj = function(obj) {
                Object.keys(obj).forEach(function(k) {
                    process(obj, k, obj[k]);
                });
            };

            walkObj(obj);
            return obj;
        },
        args: [ "$$CURRENT" ],
        lang: "js"
    }}}}
]);
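As a sketch of that testing step, assuming the pipeline above was finished off with {$out: "foo2"} ("foo2" being an illustrative target collection, not something from the answer), one quick sanity check is to inspect the BSON type of the fields that were expected to change:

// quick spot check: the converted fields should now report the "date" BSON type
db.foo2.aggregate([
    {$project: {dateType: {$type: "$date"}, birthdayType: {$type: "$birthday"}}},
    {$limit: 5}
])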

Upvotes: 1

ray

Reputation: 15227

Your direction is correct. You can simply reuse the same field names in $addFields to overwrite them. For inserting into a new collection, you can use $merge or simply $out.

db.collection.aggregate([
  {
    "$addFields": {
      "date": {
        "$toDate": "$date"
      },
      "birthday": {
        "$toDate": "$birthday"
      }
    }
  },
  {
    "$merge": {
      "into": "collection2",
      "on": "_id",
      "whenMatched": "merge",
      "whenNotMatched": "insert"
    }
  }
])

Here is the Mongo playground for your reference.
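One thing to note: $toDate raises an error for values it cannot parse. If some documents might hold unparseable strings, a more defensive sketch of the same pipeline swaps in $convert with an onError fallback (as in the other answer), keeping the original value when conversion fails:

db.collection.aggregate([
  {
    "$addFields": {
      "date": {
        "$convert": { "input": "$date", "to": "date", "onError": "$date" }
      },
      "birthday": {
        "$convert": { "input": "$birthday", "to": "date", "onError": "$birthday" }
      }
    }
  },
  {
    "$merge": {
      "into": "collection2",
      "on": "_id",
      "whenMatched": "merge",
      "whenNotMatched": "insert"
    }
  }
])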

Upvotes: 5
