Cheshire Cat
Cheshire Cat

Reputation: 1961

MongoDB Add new field calculated from datetime and converted to int number

I'va a MongoDB collection (on MongoDB v3.2.12) with about 1 Billion records. The collection has a datetime key. I need to add a new key to the collection, derived from the datetime one, such as the value is "yyyymmdd" and save it as an integer.

For now with the following command I was able to create a duplicated collection with the new key as a string.

db.log.aggregate(
    [
        { "$addFields": { 
            "date_ref": { $dateToString: { format: "%Y%m%d", date: "$log_date" } } 
        }},
        { "$out": "log_test" }
    ]
)

Trying to add NumberInt() or parseInt() as following, didn't work.

db.log.aggregate(
    [
        { "$addFields": { 
            "date_ref": NumberInt( { $dateToString: { format: "%Y%m%d", date: "$log_date" } } )
        }},
        { "$out": "log_test" }
    ]
)

Using NumberInt() I get all values to 0. Using parseInt() I get all values to 'NAN'.

I know I can use a new key and a foreach statement like this:

db.log_test.find().forEach( function (l) {
  l.date_ref_int = parseInt(l.date_ref); 
  db.log_test.save(l);
});

But this approach is very, very expensive in terms of time. So, is it possible to do the conversion within the aggregate?

UPDATE

Another approach I tried was this:

db.log.aggregate(
    [
        { "$addFields": { 
            "date_ref_string": { $dateToString: { format: "%Y%m%d", date: "$log_date" } },
            "date_ref": NumberInt("date_ref_string") 
        }},
        { "$out": "log_test" }
    ]
)

I get no errors but the "date_ref" is always 0.

UPDATE 2

Following @Veeram suggestion I made this script:

db.log.aggregate(
    [
        { $addFields: 
            { "date_ref": 
                { $divide:
                    [
                        { $add:
                            [
                                { $multiply: [{"$year" : "$log_date"}, 10000000]}, 
                                { $multiply: [{"$month" : "$log_date"}, 100000]}, 
                                { $multiply: [{"$dayOfMonth" : "$log_date"}, 1000]}
                            ]
                        }, 1000
                    ]
                }
            }
        },
        { "$out": "log_test" }
    ]
);

It works, but I get back values as double. I prefer an integer, because I need to build an index on that new calculated key and it would be smaller if the key is integer type.

Upvotes: 0

Views: 1162

Answers (1)

s7vr
s7vr

Reputation: 75984

You can use below aggregation query.

db.log.aggregate([
  {"$addFields":{
    "date_ref":{
      "$trunc":{
        "$divide":[
          {"$add":[
            {"$multiply":[{"$year":"$log_date"},10000000]},
            {"$multiply":[{"$month":"$log_date"},100000]},
            {"$multiply":[{"$dayOfMonth":"$log_date"},1000]}
          ]},
          1000
        ]
      }
    }
  }}
])

Upvotes: 1

Related Questions