Michael Draper
Michael Draper

Reputation: 1968

Mongo Query to Return only a subset of SubDocuments

Using the example from the Mongo docs:

{ _id: 1, results: [ { product: "abc", score: 10 }, { product: "xyz", score: 5 } ] }
{ _id: 2, results: [ { product: "abc", score: 8 }, { product: "xyz", score: 7 } ] }
{ _id: 3, results: [ { product: "abc", score: 7 }, { product: "xyz", score: 8 } ] }

db.survey.find(
   { id: 12345, results: { $elemMatch: { product: "xyz", score: { $gte: 6 } } } }
)

How do I return survey 12345 (regardless of even if it HAS surveys or not) but only return surveys with a score greater than 6? In other words I don't want the document disqualified from the results based on the subdocument, I want the document but only a subset of subdocuments.

Upvotes: 3

Views: 1950

Answers (3)

Sachin
Sachin

Reputation: 2932

You can use $filter in mongoDB 3.2

db.survey.aggregate([{
        $match: {
             { id: 12345}
        }
    }, {
        $project: {
            results: {
                $filter: {
                    input: "$results",
                    as: "results",
                    cond:{$gt: ['$$results.score', 6]}
                }
            }
        }
    }]);

It will return all the sub document that have score greater than 6. If you want to return only first matched document than you can use '$' operator.

Upvotes: 3

Alan Bogu
Alan Bogu

Reputation: 785

You can use $redact in this way:

db.survey.aggregate( [ 
  { $match : { _id : 12345 }},
  { $redact: {
     $cond: {
        if: { 
          $or: [ 
            { $eq: [ "$_id", 12345 ] },
            { $and: [ 
              { $eq: [ "$product", "xyz" ] }, 
              { $gte: [ "$score", 6 ] }
            ]}
          ] 
        },
        then: "$$DESCEND",
        else: "$$PRUNE"
      }
    }
  }
] );

It will $match by _id: 12345 first and then it will "$$PRUNE" all the subdocuments that don't have "product":"xyz" and don't have score greater or equal 6. I added the condition ($cond) { $eq: [ "$_id", 12345 ] } so that it wouldn't prune the whole document before it reaches the subdocuments.

Upvotes: 2

Blakes Seven
Blakes Seven

Reputation: 50436

What you are asking for is not so much a "query" but is basically just a filtering of content from the array in each document.

You do this with .aggregate() and $project:

db.survey.aggregate([
    { "$project": {
        "results": {
            "$setDifference": [
                { "$map": {
                    "input": "$results",
                    "as": "el",
                    "in": {
                        "$cond": [
                            { "$and": [
                                { "$eq": [ "$$el.product", "xyz" ] },
                                { "$gte": [ "$$el.score", 6 ] }
                            ]}
                        ]
                    }
                }},
                [false]
            ]
        }
    }}
])

So rather than "contrain" results to documents that have an array member matching the condition, all this is doing is "filtering" the array members out that do not match the condition, but returns the document with an empty array if need be.

The fastest present way to do this is with $map to inspect all elements and $setDifference to filter out any values of false returned from that inspection. The possible downside is a "set" must contain unique elements, so this is fine as long as the elements themselves are unique.

Future releases will have a $filter method, which is similar to $map in structure, but directly removes non-matching results where as $map just returns them ( via the $cond and either the matching element or false ) and is then better suited.

Otherwise if not unique or the MongoDB server version is less than 2.6, you are doing this using $unwind, in a non performant way:

db.survey.aggregate([
    { "$unwind": "$results" },
    { "$group": {
        "_id": "$_id",
        "results": { "$push": "$results" },
        "matched": {
            "$sum": {
                "$cond": [
                    { "$and": [
                        { "$eq": [ "$results.product", "xyz" ] },
                        { "$gte": [ "$results.score", 6 ] }

                    ]},
                    1,
                    0
                ]
            }
        }
    }},
    { "$unwind": "$results" },
    { "$match": {
        "$or": [
            {
                "results.product": "xyz",
                "results.score": { "$gte": 6 }
            },
            { "matched": 0 }
    }},
    { "$group": {
        "_id": "$_id",
        "results": { "$push": "$results" },
        "matched": { "$first": "$matched" }
    }},
    { "$project": {
        "results": { 
            "$cond": [
                { "$ne": [ "$matched", 0 ] },
                "$results",
                []
            ]
        }
    }}
])

Which is pretty horrible in both design and perfomance. As such you are probably better off doing the filtering per document in client code instead.

Upvotes: 4

Related Questions