Reputation: 645
I have the following table
ID NAME DATE VALUE
1 John 2022-02-01 27
2 John 2022-02-01 9
3 John 2022-01-31 -12
4 John 2021-12-15 32
5 Tom 2021-10-22 11
6 Tom 2021-05-05 201
7 Tom 2020-11-27 76
8 Bob 2022-01-07 43
9 Bob 2021-08-26 3
10 Bob 2021-05-13 12
11 Bob 2021-03-08 2
I want to write a query that ignores the rows with the latest DATE
for each NAME
ID NAME DATE VALUE
3 John 2022-01-31 -12
4 John 2021-12-15 32
6 Tom 2021-05-05 201
7 Tom 2020-11-27 76
9 Bob 2021-08-26 3
10 Bob 2021-05-13 12
11 Bob 2021-03-08 2
No idea how to write a query like that....
Upvotes: 1
Views: 168
Reputation: 4272
When writing a query to answer get all the rows with (or without) the highest X you can often solve it using window functions like Zakaria's answer, DISTINCT ON, or lateral joins.
I personally prefer the expressiveness of window functions but there are times where lateral joins are better as they can be more performant
Here is how to do it with a lateral
.
First we need to project a table where for each row in your original table we tie it to the row in it's group (i.e. rows with the same name) with the latest date.
select users.*, latest_dates.* from users
left join lateral (
select u.* from users as u
where u.name = users.name AND u.date > users.date
order by u.date DESC
limit 1
) as latest_dates on true;
id | name | date | value | id | name | date | value
----+------+---------------------+-------+----+------+---------------------+-------
1 | John | 2022-02-01 00:00:00 | 27 | | | |
2 | John | 2022-02-01 00:00:00 | 9 | | | |
3 | John | 2022-01-31 00:00:00 | -12 | 1 | John | 2022-02-01 00:00:00 | 27
4 | John | 2021-12-15 00:00:00 | 32 | 1 | John | 2022-02-01 00:00:00 | 27
5 | Tom | 2021-10-22 00:00:00 | 11 | | | |
6 | Tom | 2021-05-05 00:00:00 | 201 | 5 | Tom | 2021-10-22 00:00:00 | 11
7 | Tom | 2020-11-27 00:00:00 | 76 | 5 | Tom | 2021-10-22 00:00:00 | 11
8 | Bob | 2022-01-07 00:00:00 | 43 | | | |
9 | Bob | 2021-08-26 00:00:00 | 3 | 8 | Bob | 2022-01-07 00:00:00 | 43
10 | Bob | 2021-05-13 00:00:00 | 12 | 8 | Bob | 2022-01-07 00:00:00 | 43
11 | Bob | 2021-03-08 00:00:00 | 2 | 8 | Bob | 2022-01-07 00:00:00 | 43
Above we can see for rows which are the latest date in the group will have null values on the right hand side. So its simply a case of creating a query to exclude such values.
select users.*, latest_dates.* from users
left join lateral (
select u.* from users as u
where u.name = users.name AND u.date > users.date
order by u.date DESC
limit 1
) as latest_dates on true
where latest_dates.date IS NOT NULL;
This gives you the result you are looking for. Here is the fiddle
Upvotes: 1
Reputation: 4786
You can use rank()
to rank the rows by descending DATE
, then filter out rows with rank = 1 :
select ID, NAME, DATE, VALUE from
(select *,
rank() over(partition by NAME order by DATE desc) r
from table_name) t
where r > 1;
Upvotes: 3
Reputation: 30920
We could do:
WITH max_date_table AS
(
SELECT MAX(DATE) AS max_date_value, NAME
FROM mytable
GROUP BY NAME
)
SELECT mytable.ID, mytable.NAME, mytable.DATE, mytable.VALUE
FROM mytable
INNER JOIN max_date_table
ON mytable.NAME=max_date_table.NAME
WHERE mytable.DATE != max_date_table.max_date_value
Upvotes: 0