Reputation: 2797
I have a collection like
{
"_id" : ObjectId("5738cb363bb56eb8f76c2ba8"),
"records" : [
{
"Name" : "Joe",
"Salary" : 70000,
"Department" : "IT"
}
]
},
{
"_id" : ObjectId("5738cb363bb56eb8f76c2ba9"),
"records" : [
{
"Name" : "Henry",
"Salary" : 80000,
"Department" : "Sales"
},
{
"Name" : "Jake",
"Salary" : 40000,
"Department" : "Sales"
}
]
},
{
"_id" : ObjectId("5738cb363bb56eb8f76c2baa"),
"records" : [
{
"Name" : "Sam",
"Salary" : 90000,
"Department" : "IT"
},
{
"Name" : "Tom",
"Salary" : 50000,
"Department" : "Sales"
}
]
}
I want to have the results with the highest salary by each department
{"Name": "Sam", "Salary": 90000, "Department": "IT"}
{"Name": "Henry", "Salary": 80000, "Department": "Sales"}
I could get the highest salary. But I could not get the corresponding employee names.
db.HR.aggregate([
{ "$unwind": "$records" },
{ "$group":
{
"_id": "$records.Department",
"max_salary": { "$max": "$records.Salary" }
}
}
])
Could somebody help me?
Upvotes: 8
Views: 2861
Reputation: 61225
You need to $sort
your document after $unwind
and use the $first
operator in the $group
stage. You can also use the $last
operator in which case you will need to sort your documents in ascending order
db.HR.aggregate([
{ '$unwind': '$records' },
{ '$sort': { 'records.Salary': -1 } },
{ '$group': {
'_id': '$records.Department',
'Name': { '$first': '$records.Name' } ,
'Salary': { '$first': '$records.Salary' }
}}
])
which produces:
{ "_id" : "Sales", "Name" : "Henry", "Salary" : 80000 }
{ "_id" : "IT", "Name" : "Sam", "Salary" : 90000 }
To return the maximum salary and employees list for each department you need to use the $max
in your group stage to return the maximum "Salary" for each group then use $push
accumulator operator to return a list of "Name" and "Salary" for all employees for each group. From there you need to use the $map
operator in your $project
stage to return a list of names alongside the maximum salary. Of course the $cond
here is used to compare each employee salary to the maximum value. The $setDifference
does his work which is filter out all false
and is fine as long as the data being filtered is "unique". In this case it "should" be fine, but if any two results contained the same "name" then it would skew results by considering the two to be one.
db.HR.aggregate([
{ '$unwind': '$records' },
{ '$group': {
'_id': '$records.Department',
'maxSalary': { '$max': '$records.Salary' },
'persons': {
'$push': {
'Name': '$records.Name',
'Salary': '$records.Salary'
}
}
}},
{ '$project': {
'maxSalary': 1,
'persons': {
'$setDifference': [
{ '$map': {
'input': '$persons',
'as': 'person',
'in': {
'$cond': [
{ '$eq': [ '$$person.Salary', '$maxSalary' ] },
'$$person.Name',
false
]
}
}},
[false]
]
}
}}
])
which yields:
{ "_id" : "Sales", "maxSalary" : 80000, "persons" : [ "Henry" ] }
{ "_id" : "IT", "maxSalary" : 90000, "persons" : [ "Sam" ] }
Upvotes: 8
Reputation: 4117
Another possible solution:
db.HR.aggregate([
{"$unwind": "$records"},
{"$group":{
"_id": "$records.Department",
"arr": {"$push": {"Name":"$records.Name", "Salary":"$records.Salary"}},
"maxSalary": {"$max":"$records.Salary"}
}},
{"$unwind": "$arr"},
{"$project": {
"_id":1,
"arr":1,
"isMax":{"$eq":["$arr.Salary", "$maxSalary"]}
}},
{"$match":{
"isMax":true
}}
])
This solution takes advantage of the $eq operator to compare two fields in the $project stage.
Test case:
db.HR.insert({"records": [{"Name": "Joe", "Salary": 70000, "Department": "IT"}]})
db.HR.insert({"records": [{"Name": "Henry", "Salary": 80000, "Department": "Sales"}, {"Name": "Jake", "Salary": 40000, "Department": "Sales"}, {"Name": "Santa", "Salary": 90000, "Department": "IT"}]})
db.HR.insert({"records": [{"Name": "Sam", "Salary": 90000, "Department": "IT"}, {"Name": "Tom", "Salary": 50000, "Department": "Sales"}]})
Result:
{ "_id" : "Sales", "arr" : { "Name" : "Henry", "Salary" : 80000 }, "isMax" : true }
{ "_id" : "IT", "arr" : { "Name" : "Santa", "Salary" : 90000 }, "isMax" : true }
{ "_id" : "IT", "arr" : { "Name" : "Sam", "Salary" : 90000 }, "isMax" : true }
Upvotes: 1
Reputation: 20129
Its not the most intuitive thing, but instead of $max
you should be using $sort
and $first
:
{ "$unwind": "$records" },
{ "$sort": { "$records.Salary": -1},
{ "$group" :
{
"_id": "$records.Department",
"max_salary": { "$first": "$records.Salary" },
"name": {$first: "$records.Name"}
}
}
Alternatively, I think this is doable using the $$ROOT
operator (fair warning: I've not actually tried this) -
{ "$unwind": "$records" },
{ "$group":
{
"_id": "$records.Department",
"max_salary": { "$max": "$records.Salary" }
"name" : "$$ROOT.records.Name"
}
}
}
Upvotes: 3