Reputation: 848
This should be a simple problem to solve, but for some reason I cannot understand how to do this efficiently...
Let's say I have a table (postgres) that looks like this:
id | date | value |
---|---|---|
1 | 2021-04-01 | 1 |
1 | 2021-04-03 | 10 |
1 | 2021-04-04 | 8 |
1 | 2021-04-05 | 3 |
1 | 2021-04-08 | 5 |
2 | 2021-04-04 | 3 |
2 | 2021-04-05 | 5 |
2 | 2021-04-07 | 5 |
2 | 2021-04-10 | 9 |
2 | 2021-04-12 | 11 |
My goal is to extract 1 row per id, which is the most recent record for that id.
The important feature of this table to note, is that the most recent date (by id) is not the same across ids. So I cannot simply query for the max(date) and use that to query the data. I must first find the most recent date for each id, and use that to query the data.
A successful query would result in this response:
id | date | value |
---|---|---|
1 | 2021-04-08 | 5 |
2 | 2021-04-12 | 11 |
I am making this query on an extremely large amount of data as well, so this solution needs to be as optimized as possible.
Upvotes: 0
Views: 291
Reputation: 24593
you can use window function :
select * from
(
select * , row_number() over (partition by id order by date desc) rn
from yourtable
) t
where rn = 1
Upvotes: 2