The Hawk
The Hawk

Reputation: 1568

SQL Get Other Rows From Aggregate Function

I have an aggregate function that does a group by (col A). It selects the maximum value from a set of columns(col B), but I also want to return another value from a column in the same row(col C). But if it groups 3 rows it selects the first value from column C not the column with the maximum (MAX(col B)).

A    B    C
1     75  jkl
1    100  abc
1    125  dae
2    200  def
3    300  ghi

"SELECT A, MAX(B), C FROM myTable where B > 50 GROUP BY A"

returns (first row) A => 1, B => 125, C => jkl

I want it to return 

A => 1, B => 125, C => dae

Upvotes: 9

Views: 14429

Answers (3)

bwperrin
bwperrin

Reputation: 692

This is a very common problem: "show me other columns on the rows matching my min()/max() criteria." On large tables, subquery or rank function strategies usually scan the entire table and can become very slow.

If you're willing to get your head around it, this is by far the most performant way to handle this (though not the most readable):

SELECT A, cast(left(val, 8) as int) AS B, substring(val, 9, 999) AS C
FROM ( SELECT A, max(str(B, 8) + C) AS val FROM myTable GROUP BY A) t

You can concatenate anything to what you're maxing as long as you concatenate the same length before the final field (hence str function's leading spaces to sort numerical values correctly), then extract it in the outer query. All index-friendly. Voilá.

Note that this will return different (and I think better) results than the solutions posted by bluefeet and JW, in that if there are multiple matching max values per group, this method will pick a winner (the largest C) whereas the others will return multiple records. So, if the 3rd B value were 100 instead of 125, this will return 1, 100, dae whereas the other solutions would return both 1, 100, abd and 1, 100, dae.

Upvotes: 4

Taryn
Taryn

Reputation: 247880

You will want to use a subquery that will get the max(b) by each A and then join that value back to your table to return the remaining columns that match the values of the subquery:

select *
from mytable t1
inner join
(
  select A, max(b) B
  from mytable
  where b >50
  group by a
) t2
  on t1.a = t2.a
  and t1.b = t2.b
where t1.b >50

See SQL Fiddle with Demo

Upvotes: 16

John Woo
John Woo

Reputation: 263933

since you haven't mention the RDBMS you are using, use this query which works on almost all RDBMS

SELECT  a.*
FROM    tableName a
        INNER JOIN
        (
            SELECT  A, MAX(b) max_B
            FROM    tableName
            WHERE   b > 50
            GROUP   BY A
        ) b ON a.A = b.A   AND
            a.B = b.max_B

But if your RDBMS support window functions, you can use DENSE_RANK()

SELECT  A, B, C
FROM    
        (
            SELECT  A, B, C,
                    DENSE_RANK() OVER (PARTITION A ORDER BY B DESC) rn
            FROM    tableName
            WHERE   b > 50
            GROUP   BY      
        ) a
WHERE   rn = 1

Upvotes: 5

Related Questions