Atul Prasad
Atul Prasad

Reputation: 33

Get distinct values from array based on conditions within the array

I have a collection with documents like this:

{
"_id" : ObjectId("59df91dae4b03289d79efb4e"),
  "name" : "TEST",
  "payload" : [ 
    {
      "_id" : "ABC",
      "status": "FALSE"
    },
    {
      "_id":"DEF",
     "status": "TRUE"
    },
    {
      "_id" : "XYZ",
     "status": "NULL"
    }
  ]
}

I am trying to find all the payload._id which have status as TRUE/FALSE. Expected result is

["ABC","DEF"]

So far, I have something like this.

db.collection.distinct('payload._id',{"_id" : "TEST"})

Not sure how to add conditions for elements within array.

Upvotes: 2

Views: 2946

Answers (2)

Neil Lunn
Neil Lunn

Reputation: 151210

Query conditions with .distinct() apply to "document selection" and not the array entries contained "within" the document. If you need to "filter" array content then you apply .aggregate() instead, as well as a little post processing to get only the "values" in the array response.

db.collection.aggregate([
  { "$match": { "_id": "TEST" } },
  { "$unwind": "$payload" },
  { "$match": { "payload.status": { "$in": ["TRUE","FALSE"] } } },
  { "$group": { "_id": "$payload._id" } },
]).map( d => d._id );

The main parts there are the $unwind pipeline stage which you do primarily because you want the values from within the array to use later as the key to $group on. This essentially produces a new document for each array member, but each document only contains that array member. It's "denormalizing" for MongoDB structures that contain arrays.

The next thing is the following $match pipeline, which works like any query and only selects documents that match the conditions. Since all array members are now "documents" then non matching entries ( as documents ) get excluded. You could alternately use $filter to extract whilst still an array, but since we need $unwind for the next stage we may as well simply $match.

At this point you are only left with the array entries that match the conditions. The $group is to obtain "distinct" values, so typically you would do this over a wider selection than just a single document or anything where the values here are not already distinct. So this is really just keeping all the same behavior of .distinct() intact.

Finally, since the output of .aggregate() differs from the design of .distinct() in that it returns "documents" in results, we simply use the .map() method to process the cursor results and return only the "values" from the specific document property as an "array".

Upvotes: 3

Alberto Centelles
Alberto Centelles

Reputation: 1253

I am trying to find all the payload._id which have status as TRUE/FALSE

The query that you want to run is:

db.collection.distinct('payload._id', { 
  'payload.status': { $in: ["TRUE", "FALSE"] } 
})

A more verbose way to do it would be:

db.collection.distinct('payload._id', { 
  $or: [
    {'payload.status': "TRUE" },
    {'payload.status': "FALSE" }
  ]  
})

Upvotes: 2

Related Questions