JAC2703
JAC2703

Reputation: 423

MongoDB aggregation including missing documents

I have some documents stored in a MongoDB similar to the following:

{
    "id": "0001",
    "name": "Joe",
    "last_name": "Blogs",
    "results":
        [
            { "id": "5001", "mark": "78" },
            { "id": "5002", "mark": "105" },
            { "id": "5005", "mark": "97" },
        ]
}

There are many entries similar to the above. The problem I have is that some "results" fields are missing entirely so some entries look like below:

{
    "id": "0001",
    "name": "Joe",
    "last_name": "Blogs"
}

I'm trying to aggregate this to return the "name", "last_name" and an average "mark". I was getting on well with the below code:

db.sites.aggregate( 
 { $project : { name : "$name", last_name : "$last_name", results : "$results" } },
 { $unwind: "$results" },
 { $group : { 
    _id : "$_id", 
    average_mark : { $avg : "$results.mark" }, 
    name : { $last : "$name" },
    last_name : { $last : "$last_name" },
 }
)

However, it does not return an entry where there is no "results" array.

Does anyone have any idea how to return the missing "results" entries and set the average to zero?

Cheers,

James.

Upvotes: 1

Views: 2762

Answers (1)

Neil Lunn
Neil Lunn

Reputation: 151092

When the aggregation framework encounters an "empty" array, the result of an $unwind effectively removes the document from the pipeline, as it is considered that there would be no results. Also if the array is not present at all, then an error would be produced as the operator would try to access an element that does not exist.

For this reason, where you do not wish to explicitly "filter" those documents from your aggregation results, the $ifNull operator exists. It tests for the presence of a field an either returns that field or the alternate supplied argument.

db.sites.aggregate([
    { "$project": { 
        "name": 1, 
        "last_name": 1,
        "results" : { 
            "$ifNull": [ "$results", [{ "mark": 0 }] ]
        }
    }},
    { "$unwind": "$results" },
    { "$group": { 
        "_id" : "$_id", 
        "average_mark": { "$avg": "$results.mark" }, 
        "name": { "$last": "$name" },
        "last_name": { "$last": "$last_name" },
    }}
])

Similar results can be achieved with the $cond operator when you truly have an "empty" array, and even in combination:

    "results": { 
        "$cond": [ 
            {"$eq": [ "$results", [] ] },
            [{ "mark": 0 }],
            { "$ifNull": [ "$results", [{ "mark": 0 }] ] }
        ]
    }

Upvotes: 7

Related Questions