Reputation: 33
I have a table like this, and I want to return the top two person (by name) with the highest salary, and also the record with the corresponding salary. Here is the table
id, name, salary
1, Tom, 200
2, Tom, 300
3, Bob, 400
4, Bob, 500
5, Alice, 600
6, Alice, 700
I used this command
Employer.select("employers.*, max(employers.salary) as maxsalary").group("employers.name").order("maxsalary desc").limit(2)
Desired return:
id, name, salary
6, Alice, 700
4, Bob, 500
What I got seems to be like this:
id, name, salary
5, Alice, 600
3, Bob, 400
Is there anyway to select the records responding to the max ? Any comment/answer is much appreciated.
Upvotes: 0
Views: 1978
Reputation: 3412
This Question is very tricky actually! It seems very easy but it's not.
The Query:
Employer.joins(%Q|
LEFT JOIN employers as e
ON
e.name = employers.name
AND
employers.salary < e.salary
|).where('e.salary IS NULL').order('employers.salary DESC').limit(2)
HOW DOES THAT WORK! (I've been there)
We want to make sure that we only have the highest salary for each employer and then get the highest 2 of those.
Some Theoretical Stuff (skip this part if you only want to understand the query)
Let Salary be a function S(name,id) where it returns a value given the name and id To prove that the given salary (S(name,id)) is the highest we have to prove that We want to prove either
OR
The first approach will need us to get all the records for that name which I do not really like.
The second one will need a smart way to say there can be no record higher than this one.
Back to SQL
If we left joins the table on the name and salary being less than the joined table:
%Q|
LEFT JOIN employers as e
ON
e.name = employers.name
AND
employers.salary < e.salary
|
we make sure that all records that has another record with higher salary for the same user to be joined:
employers.id, employers.name, employers.salary, e.id, e.name, e.salary
1 , Tom , 200 , 2 , Tom , 300
2 , Tom , 300
3 , Bob , 400 , 4 , Bob , 500
4 , Bob , 500
5 , Alice , 600 , 6 , Alice , 700
6 , Alice , 700
That will help us filter for the highest salary for each employer with no grouping needed:
where('e.salary IS NULL')
employers.id, employers.name, employers.salary, e.id, e.name, e.salary
2 , Tom , 300
4 , Bob , 500
6 , Alice , 700
Now all we need to do is sort:
order('employers.salary DESC')
employers.id, employers.name, employers.salary, e.id, e.name, e.salary
6 , Alice , 700
4 , Bob , 500
2 , Tom , 300
Then limit
limit(2)
employers.id, employers.name, employers.salary, e.id, e.name, e.salary
6 , Alice , 700
4 , Bob , 500
And that's the answer we need.
Why don't we
1.
Employer.order('salary desc').limit(2)
Because this will get us the records with the highest salaries independent of the name
employers.id, employers.name, employers.salary
5 , Alice , 600
6 , Alice , 700
2.
Employer.select('DISTINCT(name)').order('salary desc').limit(2)
The problem with that is it will only keep the first appearance of the name then sort
employers.id, employers.name, employers.salary
1 , Tom , 200
3 , Bob , 400
5 , Alice , 600
Upvotes: 4