Aaron
Aaron

Reputation: 1455

PARTITION BY with DISTINCT

I have the following data. I want to partition on ID and only return unique OrgIDs based on Date desc.

For ID 14, Encompass and Premier were chosen in the output based on the Date. For ID 57, Encompass, Example and Premier were chosen because Example would fall under the unique OrgID category.

ID  Num  Dx   Code   Describe   OrgID   Nm1     Name        Type         Date
14  242  438  26994  Fractures  4534    2332    ENCOMPASS   AGENCY       2020-01-01
14  242  438  26994  Fractures  4533    2332    Premier     Hospital     2020-02-01
14  242  438  26994  Fractures  4533    2332    Premier     Hospital     2019-08-08
14  242  438  26994  Fractures  4534    2332    ENCOMPASS   AGENCY       2019-07-09
14  242  438  26994  Fractures  4534    2332    ENCOMPASS   AGENCY       2019-08-07
57  242  438  22699  Nervous    4533    2332    Premier     Hospital     2018-01-12
57  242  438  22699  Nervous    4534    2332    ENCOMPASS   AGENCY       2020-01-09
57  242  438  22699  Nervous    4533    2332    Premier     Hospital     2020-01-01
57  242  438  22699  Nervous    4535    2332    Example     Nurse        2019-11-11
57  242  438  22699  Nervous    4534    2332    ENCOMPASS   AGENCY       2019-06-30

Expected output -

ID  Num  Dx   Code   Describe   OrgID   Nm1     Name        Type         Date
14  242  438  26994  Fractures  4534    2332    ENCOMPASS   AGENCY       2020-01-01
14  242  438  26994  Fractures  4533    2332    Premier     Hospital     2020-02-01
57  242  438  22699  Nervous    4533    2332    Premier     Hospital     2020-01-01
57  242  438  22699  Nervous    4534    2332    ENCOMPASS   AGENCY       2020-01-09
57  242  438  22699  Nervous    4535    2332    Example     Nurse        2019-11-11

Upvotes: 0

Views: 501

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1269873

In Postgres, I would recommend DISTINCT ON:

select distinct on (id, orgid) t.*
from t
order by id, orgid, date desc;

This is typically faster than the corresponding query using window functions or subqueries. And is quite speedy with an index on (id, orgid, date desc).

Upvotes: 0

sticky bit
sticky bit

Reputation: 37472

It seems like you could use row_number() to get the latest for each id and orgid pair.

SELECT x.id,
       x.num,
       ...
       x.date
       FROM (SELECT t.id,
                    t.num,
                    ...
                    t.date,
                    row_number() OVER (PARTITION BY t.id,
                                                    t.orgid
                                       ORDER BY t.date DESC) rn
                    FROM elbat t) x
       WHERE x.rn = 1;

Upvotes: 1

Related Questions