Krunal Sonparate

Reputation: 1142

How to find consecutive documents with count in MongoDB?

I have a collection of messages received from users. I want to group them into runs of consecutive documents with the same userName, with the documents sorted by timestamp. The messages collection looks like this:

{
    "_id":ObjectId("AAAA"),
    "userName":"Krunal",
    "message":"Krunal types some text",
    "timestamp":ISODate("2019-06-17T11:57:00.000")
}
{
    "_id":ObjectId("AAAB"),
    "userName":"Krunal",
    "message":"Krunal types some text again",
    "timestamp":ISODate("2019-06-17T11:59:00.000")
}
{
    "_id":ObjectId("AAAC"),
    "userName":"Krunal",
    "message":"Krunal types some text one more time",
    "timestamp":ISODate("2019-06-17T12:05:00.000")
}
{
    "_id":ObjectId("AAAD"),
    "userName":"Karan",
    "message":"Karan type some text",
    "timestamp":ISODate("2019-06-17T12:07:00.000")
}
{
    "_id":ObjectId("AAAE"),
    "userName":"Karan",
    "message":"Karan type some more text",
    "timestamp":ISODate("2019-06-17T12:10:00.000")
}
{
    "_id":ObjectId("AAAF"),
    "userName":"Krunal",
    "message":"Krunal types some text one more time",
    "timestamp":ISODate("2019-06-17T12:12:00.000")
}

I am using shortened 4-character object ids to keep this easy to read; in a real scenario they will be actual ObjectIds generated by MongoDB. From the above collection I want the output shown below:

{
    "userName":"Krunal",
    "count":3,
    "timestamp":ISODate("2019-06-17T12:05:00.000")
}
{
    "userName":"Karan",
    "count":2,
    "timestamp":ISODate("2019-06-17T12:10:00.000")
}
{
    "userName":"Krunal",
    "count":1,
    "timestamp":ISODate("2019-06-17T12:12:00.000")
}

I want the count of consecutive messages from each user, keyed by userName. Is there a query available in MongoDB for this, or do I need to write a separate algorithm on top of a simple find query?

EDIT: I don't want to group by userName only. I want to group by userName over consecutive documents. For example, in the collection above, Krunal sent 3 messages consecutively, so Krunal: 3. Then Karan sent 2 messages consecutively, so Karan: 2. Krunal then sent one more message, but after Karan, so it becomes a new object, Krunal: 1. It does not increment Krunal's previous count.

Upvotes: 4

Views: 1209

Answers (3)

Shubham Vaishnav

Reputation: 1720

This is the solution I came up with.

It is not a single aggregate query. Instead, I iterate over each entry with a forEach loop in a mongo script (a JavaScript file that can be executed with the mongo shell).

// Walk the messages in timestamp order and collapse each run of
// consecutive documents from the same user into a single entry.
var entries = [];
var current = null; // the run currently being built, or null before the first document

// The question's collection is called "messages"; the explicit sort is needed
// because find() alone does not guarantee timestamp order.
db.messages.find().sort({ timestamp: 1 }).forEach(function (entry) {
    if (current !== null && current.userName === entry.userName) {
        // Same user as the previous message: extend the current run.
        current.count += 1;
        current.timestamp = entry.timestamp;
    } else {
        // First document, or a different user: start a new run.
        current = {
            userName: entry.userName,
            count: 1,
            timestamp: entry.timestamp
        };
        entries.push(current);
    }
});

// An empty collection prints [] rather than a placeholder entry.
print(JSON.stringify(entries));

Output:

[{"userName":"Krunal","count":3,"timestamp":"2019-06-17T12:05:00.000Z"},{"userName":"Karan","count":2,"timestamp":"2019-06-17T12:10:00.000Z"},{"userName":"Krunal","count":1,"timestamp":"2019-06-17T12:12:00.000Z"}]
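
On MongoDB 5.0 or later, the same runs can likely be computed server-side in one aggregation using $setWindowFields. The following is only a sketch under assumptions: the collection is named messages (as in the question), and the field names prevUser, startsRun and runId are made up here for illustration. The idea is to compare each document with the previous one, flag the documents that start a new run, and use a running total of those flags as a run number to group on.

```javascript
// Sketch only: requires MongoDB 5.0+ for $setWindowFields and $shift.
// prevUser, startsRun and runId are hypothetical helper field names.
const pipeline = [
  // 1. Attach the userName of the previous document (by timestamp).
  {
    $setWindowFields: {
      sortBy: { timestamp: 1 },
      output: {
        prevUser: { $shift: { output: "$userName", by: -1, default: null } }
      }
    }
  },
  // 2. Flag documents whose user differs from the previous one.
  {
    $set: {
      startsRun: { $cond: [{ $eq: ["$userName", "$prevUser"] }, 0, 1] }
    }
  },
  // 3. A running total of the flags gives each run its own number.
  {
    $setWindowFields: {
      sortBy: { timestamp: 1 },
      output: {
        runId: {
          $sum: "$startsRun",
          window: { documents: ["unbounded", "current"] }
        }
      }
    }
  },
  // 4. Collapse each run into one result document.
  {
    $group: {
      _id: "$runId",
      userName: { $first: "$userName" },
      count: { $sum: 1 },
      timestamp: { $max: "$timestamp" }
    }
  },
  // 5. Restore chronological order and drop the helper id.
  { $sort: { _id: 1 } },
  { $project: { _id: 0, userName: 1, count: 1, timestamp: 1 } }
];

// Only runs inside a mongo shell session, where `db` is defined.
if (typeof db !== "undefined") {
  db.messages.aggregate(pipeline).forEach(printjson);
}
```

If two messages can share the same timestamp, their relative order is not defined by this sort, so a tie-breaking field would be needed in sortBy.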

Hope it helps.

Upvotes: 4

Rubin Porwal

Reputation: 3845

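db.collection.aggregate(

    // Pipeline
    [
        // Stage 1: sort by timestamp so that $last picks the latest message
        {
            $sort: {
                timestamp: 1
            }
        },

        // Stage 2: group by userName across the whole collection
        // (one result per user, not one per consecutive run)
        {
            $group: {
                _id: '$userName',
                count: {
                    $sum: 1
                },
                timestamp: {
                    $last: '$timestamp'
                }
            }
        },

        // Stage 3: rename _id to userName
        {
            $project: {
                userName: '$_id',
                count: 1,
                timestamp: 1,
                _id: 0
            }
        }
    ]

);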

Upvotes: 0

KS Rajput

Reputation: 203

Use this.

db.collection.aggregate([
    {
        "$group": {
            "_id": "$userName",
            "count": { "$sum": 1 },
            "timestamp": { "$first": "$$ROOT.timestamp" }
        }
    }
])

Upvotes: 0
