jones
jones

Reputation: 1453

How to compare an Array, and Count Matched Items

I have a collection that has the following documents:

 {
   _id: ObjectId("000000000000000000059734"),
   locations: ["A", "B", "C"]  
 },

 {
   _id: ObjectId("000000000000000000059735"),
   locations: ["A", "D", "K"]  
 },

 {
   _id: ObjectId("000000000000000000059736"),
   locations: ["1", "3", "C"]  
 }

Now what I want is to count the total of documents based on the following array items:

let array = ['A', 'B', '1'];

my desired result is:

{
  'A': 2,
  'B': 1,
  '1': 1
}

What I have tried:

db.getCollection('mycollection').aggregate([
  {$group: {
   "_id": { 
       "location": { 
        "A": { "$sum": { "$cond": [{ "$in": [ "A", "$locations" ] },1,0] } },
        "B": { "$sum": { "$cond": [{ "$in": [ "B", "$locations" ] },1,0] } },     
        "1": { "$sum": { "$cond": [{ "$in": [ "1", "$locations" ] },1,0] } },
       }
    }    
  }}
])

But my query result format is not the same thing I want.

For any help and guides thanks.

Upvotes: 1

Views: 50

Answers (2)

Rubin Porwal
Rubin Porwal

Reputation: 3845

db.mycollection.aggregate(

    // Pipeline
    [
        // Stage 1
        {
            $unwind: {
                path: '$locations'
            }
        },

        // Stage 2
        {
            $match: {
                locations: {
                    $in: ['A', 'B', '1']
                }
            }
        },

        // Stage 3
        {
            $group: {
                _id: '$locations',
                total: {
                    $sum: 1
                }
            }
        }

    ]


);

Upvotes: 0

Neil Lunn
Neil Lunn

Reputation: 151112

If you have MongoDB 3.4.4 at least, then you can do something like this:

var array = ['A', 'B', '1'];

db.getCollection('mycollection').aggregate([
  { "$project": {
    "locations": {
      "$map": {
        "input": {
          "$filter": {
            "input": "$locations",
            "cond": { "$in": [ "$$this", array ] }
          }
        },
        "in": { "k": "$$this", "v": 1 }
      }
    }
  }},
  { "$unwind": "$locations" },
  { "$group": {
    "_id": "$locations.k",
    "v": { "$sum": "$locations.v" }
  }},
  { "$sort": { "_id": 1 } },
  { "$group": {
     "_id": null,
     "obj": { "$push": { "k": "$_id", "v": "$v" } } 
  }},
  { "$replaceRoot": {
    "newRoot": { "$arrayToObject": "$obj" }  
  }}
])

For an older version without things like $arrayToObject, you would transform the results "after" they are returned from the server, like this:

var array = ['A', 'B', '1'];

db.getCollection('mycollection').aggregate([
  { "$project": {
    "locations": {
      "$map": {
        "input": {
          "$filter": {
            "input": "$locations",
            "cond": {
              // "$in": [ "$$this", array ]
              "$or": array.map(a => ({ "$eq": [ "$$this", a ] }) )
            }
          }
        },
        "in": { "k": "$$this", "v": 1 }
      }
    }
  }},
  { "$unwind": "$locations" },
  { "$group": {
    "_id": "$locations.k",
    "v": { "$sum": "$locations.v" }
  }},
  { "$sort": { "_id": 1 } },
  { "$group": {
     "_id": null,
     "obj": { "$push": { "k": "$_id", "v": "$v" } } 
  }},
  /*
  { "$replaceRoot": {
    "newRoot": { "$arrayToObject": "$obj" }  
  }}
  */
]).map(d => 
  d.obj.reduce((acc,curr) => Object.assign(acc,{ [curr.k]: curr.v }),{})
)

In either case, the very first stage is to $project with a $map in order to look at each value in the document array and compare it to the comparison array. In fact we use $filter to simply return only the "matches", and the $map returns a value of 1 to count each occurrence.

There are two basic approaches to the "filtering" either using $in for versions that support the operator, or using $or in older versions before that was introduced.

Frankly, would could simply use $setIntersection to get the matches as long as your document data is "unique", in that no document array contains more than one occurrence of a value. So I'm playing safe here with $filter because I don't know your data. Choose whichever suits.

 // If the "locations" content is meant to be "unique"
 { "$project": {
    "locations": {
      "$map": {
        "input": {
          "$setIntersection": [ "$locations", array ]
        },
        "in": { "k": "$$this", "v": 1 }
      }
    }
  }},

Note the $map output in k and v property form. This will continue as a pattern through the rest of the pipeline.

Because you want to "aggregate" on the k values from the array items, we use $unwind so we can add these together across documents. Then feed that through $group on the values of k and using $sum on each v to effectively "count" the occurrences.

The $sort is completely optional, really you should not care about the order of keys in a single output document. Note the difference from your "desire", but that's just a plain fact that "1" is lexically "less then" "A". So you cannot fight that, and it's just how the world works.

The next stage is simply to $group to a single document. Here we continue by reconstructing to an "array" with objects containing k and v.

The reason for this is because of the final handling. Either where you have a MongoDB with $arrayToObject support ( actually included since 3.4.4, though documentation claims 3.6 ). Where you do, we simply provide the generated "array" as input to this inside a $replaceRoot stage in order to return the final output.

Where you don't have that feature, you can process the cursor results ( here shown using the shell Cursor.map() ) and transform the document before further processing. Any iterator method will do, and most drivers have a Cursor.map(). Not that it really matters that much here, since the aggregation pipeline results in one document in this case.

The JavaScript way as will work in modern shell releases is to simply apply a .reduce() over the array and transform the output object into the desired object output. It's basically the exact same operation that the server does, but just in client code.

Either form returns the desired result:

{
    "1" : 1.0,
    "A" : 2.0,
    "B" : 1.0
}

Upvotes: 1

Related Questions