Sidharth Guglani

Reputation: 105

How to do mongoose aggregation with nested array documents

I have a MongoDB collection, Polls, with the following schema:

{  
 "options" : [ 
               {   
                 "_id" : ObjectId,
                 "option" : String,
                 "votes" : [ ObjectId ] // ObjectIds of the users who voted
               }, ...
             ]   
}

Assume that in Node.js I have the userId of the user to whom I want to send this info. My task is to:

(1) Include an extra field in the above JSON object (which I get using Mongoose):

"myVote" : option._id

where option._id is the _id of the option for which

options[someIndex].votes contains userId

(2) Change the existing "votes" field in each option to hold the number of votes for that option, as shown in the example below.

Example:

{  
  "options" : [ 
               {   
                 "_id" : 1,  
                 "option" : "A",  
                 "votes" : [ 1,2,3 ]
               },
               {   
                 "_id" : 2,  
                 "option" : "B",  
                 "votes" : [ 5 ]
               },
               {   
                 "_id" : 3,  
                 "option" : "C",  
                 "votes" : [  ]
               }
             ]   
}

So if a user with user id = 5 wants to see the poll, I need to send the following info:

Expected Result :

{  
  "my_vote" : 2,           // user with id 5 voted on option with id 2
  "options" : [ 
               {   
                 "_id" : 1,  
                 "option" : "A",  
                 "votes" : 3              //num of votes on option "A"
               },
               {   
                 "_id" : 2,  
                 "option" : "B",  
                 "votes" : 1             //num of votes on option "B"
               },
               {   
                 "_id" : 3,  
                 "option" : "C",  
                 "votes" : 0            //num of votes on option "C"
               }
             ]   
}

Upvotes: 1

Views: 1706

Answers (2)

user3561036


The currently accepted answer does not really address the question you actually asked, and it also does some unnecessary things, so here is another approach:

var userId = 5; // A variable to work into the submitted pipeline

db.sample.aggregate([
    { "$unwind": "$options" },
    { "$group": {
        "_id": "$_id",
        "my_vote": { "$min": {
            "$cond": [
                { "$setIsSubset": [ [userId], "$options.votes" ] },
                "$options._id",
                false
            ]
        }},
        "options": { "$push": {
            "_id": "$options._id",
            "option": "$options.option",
            "votes": { "$size": "$options.votes" }
        }}
    }}
])

This gives you output per document like this:

{
    "_id" : ObjectId("5573a0a8b67e246aba2b4b6e"),
    "my_vote" : 2,
    "options" : [
            {
                    "_id" : 1,
                    "option" : "A",
                    "votes" : 3
            },
            {
                    "_id" : 2,
                    "option" : "B",
                    "votes" : 1
            },
            {
                    "_id" : 3,
                    "option" : "C",
                    "votes" : 0
            }
    ]
}

So what you are doing here is using $unwind to break the options array down into one document per element for inspection. The following $group stage (the only other stage you need) then uses the $min and $push operators to reconstruct the document.

Inside $min, the $cond operator tests the array content via $setIsSubset and returns either the matched _id value or false. Because booleans sort after numbers and ObjectId values in the BSON comparison order, $min picks the matched _id whenever there is one and returns false only if the user has not voted. Inside $push, spell out every field of the inner array element rather than pushing the whole element, and use the $size operator to count the entries in votes.

You also mention, with a link to another question, dealing with an empty array with $unwind. The $size operator does the right thing here, so there is no need to $unwind the votes array and project a "dummy" value where it is empty.


As a general note, unless you are actually "aggregating" across documents, it is generally advisable to do this operation in client code rather than in the aggregation framework. Using $unwind effectively creates a new document in the aggregation pipeline for each element of the array contained in each document, which adds significant overhead.

For an operation like this that acts on each document separately, it is more efficient to process each document individually in client code.
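
As a rough sketch of that client-side approach, the reshaping could look like the following in Node.js. The helper name summarizePoll is hypothetical, and the code assumes each poll is available as a plain object. Ids are compared as strings so that both ObjectId values and the numeric ids of the example match.

```javascript
// Hypothetical helper: reshape one poll document on the client.
function summarizePoll(poll, userId) {
  const uid = String(userId); // compare as strings so ObjectId-like values match too
  let myVote = null;          // stays null when the user has not voted
  const options = poll.options.map((o) => {
    if (o.votes.some((v) => String(v) === uid)) {
      myVote = o._id;
    }
    // Replace the list of voters with its length
    return { _id: o._id, option: o.option, votes: o.votes.length };
  });
  return { my_vote: myVote, options: options };
}

// Sample data taken from the question
const poll = {
  options: [
    { _id: 1, option: "A", votes: [1, 2, 3] },
    { _id: 2, option: "B", votes: [5] },
    { _id: 3, option: "C", votes: [] },
  ],
};

const result = summarizePoll(poll, 5);
console.log(JSON.stringify(result));
```

Returning null for a user who has not voted is a choice made for this sketch; the $group pipeline above returns false in that case.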


If you really must insist that server processing is the way to do this, then it is probably most efficient to use $map instead (userId is the same variable as above):

db.sample.aggregate([
    { "$project": {
        "my_vote": {
            "$setDifference": [
                { "$map": {
                    "input": "$options",
                    "as": "o",
                    "in": { "$cond": [
                        { "$setIsSubset": [ [userId], "$$o.votes" ] },
                        "$$o._id",
                        false
                    ]}
                }},
                [false]
            ]
        },
        "options": { "$map": {
            "input": "$options",
            "as": "o",
            "in": {
                "_id": "$$o._id",
                "option": "$$o.option",
                "votes": { "$size": "$$o.votes" }
            }
        }}
    }}
])

This simply "projects" the reworked result for each document. The my_vote field is not quite the same, though: it is a single-element array (or possibly an array of several matches), and the aggregation framework lacks operators to reduce it to a non-array value without further overhead:

{
    "_id" : ObjectId("5573a0a8b67e246aba2b4b6e"),
    "options" : [
            {
                    "_id" : 1,
                    "option" : "A",
                    "votes" : 3
            },
            {
                    "_id" : 2,
                    "option" : "B",
                    "votes" : 1
            },
            {
                    "_id" : 3,
                    "option" : "C",
                    "votes" : 0
            }
    ],
    "my_vote" : [
            2
    ]
}
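
The array-valued my_vote reflects the operators available to the pipeline above. As a hedged sketch only, assuming a server with MongoDB 3.2 or later where the $arrayElemAt operator is available, the same $project stage could unwrap the first match. The helper name buildPollPipeline is hypothetical and simply returns the pipeline as a plain array.

```javascript
// Hypothetical helper that builds the pipeline for a given user id.
// Assumes MongoDB 3.2+ for $arrayElemAt.
function buildPollPipeline(userId) {
  // Ids of the options the user voted on, with the `false` placeholders removed
  const matchedIds = {
    $setDifference: [
      {
        $map: {
          input: "$options",
          as: "o",
          in: {
            $cond: [{ $setIsSubset: [[userId], "$$o.votes"] }, "$$o._id", false],
          },
        },
      },
      [false],
    ],
  };

  return [
    {
      $project: {
        // First matching option id instead of a one-element array
        my_vote: { $arrayElemAt: [matchedIds, 0] },
        options: {
          $map: {
            input: "$options",
            as: "o",
            in: {
              _id: "$$o._id",
              option: "$$o.option",
              votes: { $size: "$$o.votes" },
            },
          },
        },
      },
    },
  ];
}

const pipeline = buildPollPipeline(5);
console.log(JSON.stringify(pipeline, null, 2));
```

The resulting array can be passed to aggregate() in the shell or on a Mongoose model. Note that Mongoose does not cast values inside pipeline stages, so when votes holds ObjectIds the userId passed in should already be an ObjectId rather than a string.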

Upvotes: 3

Sze-Hung Daniel Tsui

Reputation: 2332

Check out this question.

It's not asking the same thing, but there's no way to do what you're asking without multiple queries anyway. I would modify the JSON you get back directly, since you're just displaying extra info that is already contained in the result of the query.

  1. Save the userID you're querying for.
  2. Take the result of your query (an object containing the options array) and search through the votes of each element in the array.
  3. When you find the option that contains the user's vote, attach its _id as myVote (perhaps use 'n/a' if you don't find a vote).

Write a function that does steps 2 and 3; you can then pass it a userID and get back a new object with myVote attached.
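
A minimal sketch of such a function, under the assumption that the query result is a plain object. The name attachMyVote is hypothetical; ids are compared as strings so ObjectId values would match as well.

```javascript
// Hypothetical helper implementing steps 2 and 3.
function attachMyVote(poll, userId) {
  const uid = String(userId);
  // Step 2: find the option whose votes array contains the user
  const voted = poll.options.find((o) =>
    o.votes.some((v) => String(v) === uid)
  );
  // Step 3: return a shallow copy with myVote attached, 'n/a' when no vote exists
  return Object.assign({}, poll, { myVote: voted ? voted._id : "n/a" });
}

// Sample data taken from the question
const samplePoll = {
  options: [
    { _id: 1, option: "A", votes: [1, 2, 3] },
    { _id: 2, option: "B", votes: [5] },
    { _id: 3, option: "C", votes: [] },
  ],
};

const withVote = attachMyVote(samplePoll, 5);
console.log(withVote.myVote);
```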

I don't think doing it like this will be slower than doing another query in Mongoose.

Upvotes: 0
