Erik
Erik

Reputation: 45

How to return the category with max value for every user in postgresql?

This is the table

id category value
1 A 40
1 B 20
1 C 10
2 A 4
2 B 7
2 C 7
3 A 32
3 B 21
3 C 2

I want the result like this

id category
1 A
2 B
2 C
3 A

Upvotes: 1

Views: 700

Answers (2)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 656596

For small tables or for only very few rows per user, a subquery with the window function rank() (as demonstrated by The Impaler) is just fine. The resulting sequential scan over the whole table, followed by a sort will be the most efficient query plan.

For more than a few rows per user, this gets increasingly inefficient though.

Typically, you also have a users table holding one distinct row per user. If you don't have it, created it! See:

We can leverage that for an alternative query that scales much better - using WITH TIES in a LATERAL subquery. Requires Postgres 13 or later.

SELECT u.id, t.*
FROM   users u
CROSS  JOIN LATERAL (
   SELECT t.category
   FROM   tbl t
   WHERE  t.id = u.id
   ORDER  BY t.value DESC
   FETCH  FIRST 1 ROWS WITH TIES  -- !
   ) t;

db<>fiddle here

See:

This can use a multicolumn index to great effect - which must exist, of course:

CREATE INDEX ON tbl (id, value);

Or:

CREATE INDEX ON tbl (id, value DESC);

Even faster index-only scans become possible with:

CREATE INDEX ON tbl (id, value DESC, category);

Or (the optimum for the query at hand):

CREATE INDEX ON tbl (id, value DESC) INCLUDE (category);

Assuming value is defined NOT NULL, or we have to use DESC NULLS LAST. See:

To keep users in the result that don't have any rows in table tbl, user LEFT JOIN LATERAL (...) ON true. See:

Upvotes: 2

The Impaler
The Impaler

Reputation: 48769

You can use RANK() to identify the rows you want. Then, filtering is easy. For example:

select *
from (
  select *,
    rank() over(partition by id order by value desc) as rk
  from t
) x
where rk = 1

Result:

 id  category  value  rk 
 --- --------- ------ -- 
 1   A         40     1  
 2   B         7      1  
 2   C         7      1  
 3   A         32     1  

See running example at DB Fiddle.

Upvotes: 0

Related Questions