Bharathiraja S
Bharathiraja S

Reputation: 699

MongoDB $group aggregation

I have collection like this

OrgName EmpId   Domain    Date
Google  12345   ABC   2017/01/01
Google  12345   XYZ   2017/02/01 
Google  67890   ABC   2017/03/01
Google  45678   ABC   2017/03/02
Yahoo   69875   HGF   2017/03/02
Google  45678   XYZ   2017/03/03
Google  45678   XYZ   2017/03/03
Google  12345   XYZ   2017/03/03
Google  12345   ABC   2017/03/04
Google  12345   ABC   2017/04/05

I need to fetch which employee having the max "Domain" count and must be in both "ABC" and "XYZ" domain GROUPBY OrgName wise.

I am using below query:

db.Collection1.aggregate([{ "$match" : { "$or" : [ { "Domain": "ABC"},{ "Domain": "XYZ"}]}},
{
    $group :{ "_id": {"OrgName" : "$OrgName", "EmpId" : "$EmpId",
        "Domain" : "$Domain"},
            count:{ $sum : 1 },
            "participantData" : { "$push" : { "EmpId" : "$EmpId" , "Domain" : "$Domain"}}}},
    {$sort:{"count":-1}},
     {$limit: 10}
],{ allowDiskUse: true })

In above example, am expecting result : employee_id=12345 present in both "ABC" and "XYZ" Domain count is 5 (i.e., 12345.ABC = 3 and 12345.XYZ=2).

Upvotes: 0

Views: 33

Answers (1)

s7vr
s7vr

Reputation: 75984

You can try below query.

The below query $group by OrgName, EmpId followed by $match to filter documents where participant array contains both 'ABC' & 'XYZ` value.

$sort the filtered data by count and output first 10 values.

db.collection.aggregate([
  {"$match":{"$or":[{"Domain":"ABC"},{"Domain":"XYZ"}]}},
  {"$group":{
    "_id":{"OrgName":"$OrgName","EmpId":"$EmpId"},
    "count":{"$sum":1},
    "participantData":{"$push":{"EmpId":"$EmpId","Domain":"$Domain"}}
  }},
  {"$match":{"participantData.Domain":{"$all":["ABC","XYZ"]}}},
  {"$sort":{"count":-1}},
  {"$limit":10}
])

Upvotes: 1

Related Questions