How can I retrieve the complete record with the maximun value grouped by another in Postgres?

Question

I have a table (it's a big query in fact, so don't use joins over the table please) as follows:

date     | priority | data
20200301 | 1        | 0.3
20200301 | 2        | 0.4
20200302 | 2        | 0.4
20200302 | 3        | 0.1
20200303 | 1        | 0.8

So, I want the date and the data with the LOWEST priority of each date, so the result of the query I'm looking for would be:

date     | priority | data
20200301 | 1        | 0.3
20200302 | 2        | 0.4
20200303 | 1        | 0.8

Whenever I try to make a group by clause, that query cannot retrieve the data column nor support different values on the data column.

Marth · Accepted Answer

You can use a the row_number window function for this:

CREATE TABLE t (
    "date"     INTEGER,
    "priority" INTEGER,
    "data"     FLOAT
);

INSERT INTO t
    ("date", "priority", "data")
VALUES ('20200301', '1', '0.3')
     , ('20200301', '2', '0.4')
     , ('20200302', '2', '0.4')
     , ('20200302', '3', '0.1')
     , ('20200303', '1', '0.8');


SELECT *
FROM (
    SELECT *, row_number() OVER (PARTITION BY date ORDER BY priority)
    FROM t
) f
WHERE row_number = 1

returns:

+--------+--------+----+----------+
|date    |priority|data|row_number|
+--------+--------+----+----------+
|20200301|1       |0.3 |1         |
|20200302|2       |0.4 |1         |
|20200303|1       |0.8 |1         |
+--------+--------+----+----------+

As mentioned by @david in the comments, it might be more efficient to filter the rows based on "priority = min_priority_for_date" (instead of ranking them and filtering them afterwards):

SELECT *
FROM t
WHERE (date, priority) IN (
    SELECT date, MIN(priority)
    FROM t
    GROUP BY date
)

How can I retrieve the complete record with the maximun value grouped by another in Postgres?

Answers (1)

Related Questions