Reputation: 1568
I have an aggregate function that does a group by (col A). It selects the maximum value from a set of columns(col B), but I also want to return another value from a column in the same row(col C). But if it groups 3 rows it selects the first value from column C not the column with the maximum (MAX(col B)).
A B C
1 75 jkl
1 100 abc
1 125 dae
2 200 def
3 300 ghi
"SELECT A, MAX(B), C FROM myTable where B > 50 GROUP BY A"
returns (first row) A => 1, B => 125, C => jkl
I want it to return
A => 1, B => 125, C => dae
Upvotes: 9
Views: 14429
Reputation: 692
This is a very common problem: "show me other columns on the rows matching my min()/max() criteria." On large tables, subquery or rank function strategies usually scan the entire table and can become very slow.
If you're willing to get your head around it, this is by far the most performant way to handle this (though not the most readable):
SELECT A, cast(left(val, 8) as int) AS B, substring(val, 9, 999) AS C
FROM ( SELECT A, max(str(B, 8) + C) AS val FROM myTable GROUP BY A) t
You can concatenate anything to what you're max
ing as long as you concatenate the same length before the final field (hence str
function's leading spaces to sort numerical values correctly), then extract it in the outer query. All index-friendly. Voilá.
Note that this will return different (and I think better) results than the solutions posted by bluefeet and JW, in that if there are multiple matching max values per group, this method will pick a winner (the largest C) whereas the others will return multiple records. So, if the 3rd B value were 100 instead of 125, this will return 1, 100, dae whereas the other solutions would return both 1, 100, abd and 1, 100, dae.
Upvotes: 4
Reputation: 247880
You will want to use a subquery that will get the max(b)
by each A
and then join that value back to your table to return the remaining columns that match the values of the subquery:
select *
from mytable t1
inner join
(
select A, max(b) B
from mytable
where b >50
group by a
) t2
on t1.a = t2.a
and t1.b = t2.b
where t1.b >50
Upvotes: 16
Reputation: 263933
since you haven't mention the RDBMS you are using, use this query which works on almost all RDBMS
SELECT a.*
FROM tableName a
INNER JOIN
(
SELECT A, MAX(b) max_B
FROM tableName
WHERE b > 50
GROUP BY A
) b ON a.A = b.A AND
a.B = b.max_B
But if your RDBMS support window functions, you can use DENSE_RANK()
SELECT A, B, C
FROM
(
SELECT A, B, C,
DENSE_RANK() OVER (PARTITION A ORDER BY B DESC) rn
FROM tableName
WHERE b > 50
GROUP BY
) a
WHERE rn = 1
Upvotes: 5