user3163545

Reputation: 245

MongoDB time series operations and generation

I have a MongoDB collection with documents like these:

{
"_id" : ObjectId("53cb898bed4bd6c24ae07a9f"),
"account" : "C1",
"created_on" : ISODate("2014-10-01T01:23:00.000Z"),
"value" : 253
}

and

{
"_id" : ObjectId("52cb898bed4bd6c24ae06a9e"),
"account" : "C2",
"created_on" : ISODate("2014-10-01T01:23:00.000Z"),
"value" : 9381
}

There is one document every minute for each of C1 and C2.

I would like to generate data for another account, "C0", whose value is equal to (C2 - C1) * 0.25. The aim is to generate a C0 document for every minute in the collection.

Is it possible to do that in the mongo shell?

Thank you very much :)

Upvotes: 2

Views: 105

Answers (1)

BatScream

Reputation: 19700

The logic to solve this problem is as follows:

 a) Group all the records by created_on date.
 b) Get the values of both documents in each group.
 c) Calculate the difference between the C2 and C1 documents for each group.
 d) If one of the documents is missing, the difference
    is the value of the existing document.
 e) Project a document with value (difference * 0.25) for each group.
 f) Insert the projected document into the collection.
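
To make the steps listed above concrete before turning to the aggregation framework, here is a minimal plain JavaScript sketch that applies them to an in-memory array. The helper name computeC0 and the sample array are assumptions made for illustration only, and the sketch returns the C0 documents rather than inserting them.

```javascript
// Hypothetical helper (not part of MongoDB): builds C0 documents from an
// in-memory array of C1/C2 documents, following the steps listed above.
function computeC0(docs) {
    var groups = {}; // one entry per created_on timestamp (in ms)
    docs.forEach(function (doc) {
        if (doc.account !== "C1" && doc.account !== "C2") {
            return; // ignore other accounts
        }
        var key = doc.created_on.getTime();
        if (!groups[key]) {
            groups[key] = { created_on: doc.created_on, C1: null, C2: null };
        }
        groups[key][doc.account] = doc.value;
    });

    var result = Object.keys(groups).map(function (key) {
        var g = groups[key];
        var difference;
        if (g.C1 === null) {
            difference = g.C2;        // only C2 exists
        } else if (g.C2 === null) {
            difference = g.C1;        // only C1 exists
        } else {
            difference = g.C2 - g.C1; // both exist
        }
        return { account: "C0", created_on: g.created_on, value: difference * 0.25 };
    });

    // Return the generated documents in chronological order.
    result.sort(function (a, b) { return a.created_on - b.created_on; });
    return result;
}

// Sample data: the two documents from the question plus a lone C1 document.
var sample = [
    { account: "C1", created_on: new Date("2014-10-01T01:23:00.000Z"), value: 253 },
    { account: "C2", created_on: new Date("2014-10-01T01:23:00.000Z"), value: 9381 },
    { account: "C1", created_on: new Date("2014-10-01T01:24:00.000Z"), value: 100 }
];
var generated = computeC0(sample);
// generated[0].value is (9381 - 253) * 0.25 = 2282
// generated[1].value is 100 * 0.25 = 25
```

For the two documents in the question this gives a C0 value of 2282 at 01:23.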

I would like to propose two solutions. The first one relies on your assumption:

"There is one document every minute for each of C1 and C2."

So for every created_on time there are only two documents, one for C1 and one for C2.

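db.time.aggregate([ {
    $match : {
        "account" : {
            $in : [ "C1", "C2" ]
        }
    }
}, {
    // $first/$last are only meaningful on sorted input, so order the
    // documents such that C1 precedes C2 within each minute.
    $sort : {
        "created_on" : 1,
        "account" : 1
    }
}, {
    $group : {
        "_id" : "$created_on",
        // C1 value when both documents exist
        "first" : {
            $first : "$value"
        },
        // C2 value when both documents exist
        "second" : {
            $last : "$value"
        },
        "count" : {
            $sum : 1
        }
    }
}, {
    $project : {
        "_id" : 0,
        "value" : {
            $multiply : [ {
                // With a single document in the group, use its value;
                // otherwise compute C2 - C1.
                $cond : [ {
                    $lte : [ "$count", 1 ]
                }, "$first", {
                    $subtract : [ "$second", "$first" ]
                } ]
            }, 0.25 ]
        },
        "created_on" : "$_id",
        "account" : {
            $literal : "C0"
        }
    }
} ]).forEach(function(doc) {
    // Insert each generated C0 document back into the collection.
    db.time.insert(doc);
});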

The second solution covers a more realistic scenario. For a particular created_on time there can be n C1 documents and m C2 documents with different values, but we need only one C0 document representing the difference for that time. This requires an extra $group stage that first sums the values per account and created_on time, as below:

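db.time.aggregate([ {
    $match : {
        "account" : {
            $in : [ "C1", "C2" ]
        }
    }
}, {
    // First pass: one document per (created_on, account) with summed values.
    $group : {
        "_id" : {
            "created_on" : "$created_on",
            "account" : "$account"
        },
        "created_on" : {
            $first : "$created_on"
        },
        "values" : {
            $sum : "$value"
        }
    }
}, {
    // $group output order is undefined, so sort before using $first/$last:
    // C1 then precedes C2 within each minute.
    $sort : {
        "_id.created_on" : 1,
        "_id.account" : 1
    }
}, {
    // Second pass: one document per created_on time.
    $group : {
        "_id" : "$created_on",
        "first" : {
            $first : "$values"
        },
        "second" : {
            $last : "$values"
        },
        "count" : {
            $sum : 1
        }
    }
}, {
    $project : {
        "_id" : 0,
        "value" : {
            $multiply : [ {
                // With a single account in the group, use its summed value;
                // otherwise compute C2 - C1.
                $cond : [ {
                    $lte : [ "$count", 1 ]
                }, "$first", {
                    $subtract : [ "$second", "$first" ]
                } ]
            }, 0.25 ]
        },
        "created_on" : "$_id",
        "account" : {
            $literal : "C0"
        }
    }
} ]).forEach(function(doc) {
    // Insert each generated C0 document back into the collection.
    db.time.insert(doc);
});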
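
A possible variation, not taken from the pipelines above, avoids $first and $last altogether by summing each account's values conditionally inside a single $group stage. The sketch below only builds the pipeline array; buildC0Pipeline is a hypothetical helper name, while $match, $group, $project, $sum, $cond, $eq, $subtract, $multiply and $literal are standard aggregation operators. Note one behavioural difference, stated as an assumption about what is wanted: when an account is missing for a given minute its sum is 0, so the result is C2 * 0.25 or -(C1 * 0.25) rather than the value of the existing document.

```javascript
// Hypothetical helper: returns an aggregation pipeline that computes
// (sum of C2 - sum of C1) * 0.25 per created_on time, independent of
// the order in which documents reach the $group stage.
function buildC0Pipeline() {
    return [
        { $match: { account: { $in: ["C1", "C2"] } } },
        { $group: {
            _id: "$created_on",
            // Add the value only when the document belongs to C1 (else 0).
            c1: { $sum: { $cond: [ { $eq: ["$account", "C1"] }, "$value", 0 ] } },
            // Add the value only when the document belongs to C2 (else 0).
            c2: { $sum: { $cond: [ { $eq: ["$account", "C2"] }, "$value", 0 ] } }
        } },
        { $project: {
            _id: 0,
            account: { $literal: "C0" },
            created_on: "$_id",
            value: { $multiply: [ { $subtract: ["$c2", "$c1"] }, 0.25 ] }
        } }
    ];
}

var pipeline = buildC0Pipeline();
```

In the mongo shell the array could then be passed to db.time.aggregate(pipeline) and each resulting document inserted with the same forEach loop as above.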

Upvotes: 2
