How to return only the first occurrence of an ID with Mongoose?

Question

I have a large collection called Messages, like so:

{
   user: 94fg844f,
   event: null,
   group: null,
   name: "Jake",
   text: "Hello world"
}, {
   user: fje93jrg4,
   event: null,
   group: null,
   name: "Bob"
   text: "Testing"
}, {
   user: fje93jrg4,
   event: null,
   group: null,
   name: "Bob"
   text: "Text here"
}, {
   user: null,
   event: d0j3n9fn3,
   group: null,
   name: "My Event"
   text: "Testing 2"
}, {
   user: null,
   event: d0j3n9fn3,
   group: null,
   name: "My Event"
   text: "Another text"
}

I need to get the first occurrence of the users, events and groups.

For example, since user fje93jrg4 occurs twice, I just want to get the document back with the text of Testing since the one with the text of Text here is older than it. Same goes with the event of d0j3n9fn3. It occurs twice, although I just want to get the first document back of it with the text of Testing 2.

I looked into distinct although it seems to only support one search term, like user instead of user, event and group.

The final result of the above would be:

{
   user: 94fg844f,
   event: null,
   group: null,
   name: "Jake",
   text: "Hello world"
}, {
   user: fje93jrg4,
   event: null,
   group: null,
   name: "Bob"
   text: "Testing"
}, , {
   user: null,
   event: d0j3n9fn3,
   group: null,
   name: "My Event"
   text: "Testing 2"
}

My guess is that I'll probably have to use an aggregate with $first or something along those lines. The problem with doing 3 different queries is that I need to apply a limit so that I always get 10 results back. For example, there could be no recent groups in the mix, just events and users.

Sede · Accepted Answer

We can use the aggregation framework to do this. First we need to $sort by user and "_id". From there, we then $group by "user" and use the $last accumulator operator to return the last document for each user. Note that we can also use the $first accumulator operator if we sort our documents in descending order, but sorting in ascending order and using $last make our intention clear.

db.collection.aggregate([
    { "$sort": { "user": 1, "_id": -1 } }, 
    { "$group": { 
        "_id": "$user", 
        "user": { "$last": "$$ROOT" } 
    }} 
])

which produces:

{
    "_id" : "fje93jrg4",
    "user" : {
        "_id" : 2,
        "user" : "fje93jrg4",
        "event" : null,
        "group" : null,
        "name" : "Bob",
        "text" : "Testing"
    }
}
{
    "_id" : "94fg844f",
    "user" : {
        "_id" : 1,
        "user" : "94fg844f",
        "event" : null,
        "group" : null,
        "name" : "Jake",
        "text" : "Hello world"
    }
}
{
    "_id" : null,
    "user" : {
        "_id" : 4,
        "user" : null,
        "event" : "d0j3n9fn3",
        "group" : null,
        "name" : "My Event",
        "text" : "Testing 2"
    }
}

We may want to add a $project to our pipeline but doing so will cause a drop of performance. However it will reduce both the amount of data sent over the wire and the time and memory used to decode documents on the client side if do not need all of the key/value pairs in a document returned.

The $project stage look like this:

{ "$project": {
    "_id": "$user._id",
    "user": "$user.user",
    "event": "$user.event",
    "group": "$user.group",
    "name": "$user.name",
    "text": "$user.text"
}}

How to return only the first occurrence of an ID with Mongoose?

Answers (2)

Related Questions