Jeremy
Jeremy

Reputation: 113

How to select the latest date for each group by number?

I've been stuck on this question for a while, and I was wondering if the community would be able to direct me in the right direction?

I have some tag IDs that needs to be grouped, with exceptions (column: deleted) that need to be retained in the results. After which, for each grouped tag ID, I need to select the one with the latest date. How can I do this? An example below:

ID  |  TAG_ID  |  DATE     |  DELETED 
1   |  300     |  05/01/20 |  null
2   |  300     |  03/01/20 |  04/01/20
3   |  400     |  06/01/20 |  null
4   |  400     |  05/01/20 |  null
5   |  400     |  04/01/20 |  null
6   |  500     |  03/01/20 |  null
7   |  500     |  02/01/20 |  null

I am trying to reach this outcome:

ID  |  TAG_ID  |  DATE     |  DELETED 
1   |  300     |  05/01/20 |  null
2   |  300     |  03/01/20 |  04/01/20
3   |  400     |  06/01/20 |  null
6   |  500     |  03/01/20 |  null

So, firstly if there is a date in the "DELETED" column, I would like the row to be present. Secondly, for each unique tag ID, I would like the row with the latest "DATE" to be present.

Hopefully this question is clear. Would appreciate your feedback and help! A big thanks in advance.

Upvotes: 1

Views: 54

Answers (2)

forpas
forpas

Reputation: 164064

You need 2 conditions combined with OR in the WHERE clause:
the 1st is deleted is not null, or
the 2nd that there isn't any other row with the same tag_id and date later than the current row's date, meaning that the current row's date is the latest:

select t.* from tablename t
where t.deleted is not null
or not exists (
  select 1 from tablename
  where tag_id = t.tag_id and date > t.date
) 

See the demo.
Results:

| id  | tag_id | date       | deleted  |
| --- | ------ | ---------- | -------- |
| 1   | 300    | 2020-05-01 |          |
| 2   | 300    | 2020-03-01 | 04/01/20 |
| 3   | 400    | 2020-06-01 |          |
| 6   | 500    | 2020-03-01 |          |

Upvotes: 2

Gordon Linoff
Gordon Linoff

Reputation: 1269483

Your results seem to be something like this:

select t.*
from (select t.*,
             row_number() over (partition by tag_id, deleted order by date desc) as seqnum
      from t
     ) t
where seqnum = 1 or deleted is not null;

This takes one row where deleted is null -- the most recent row. It also keeps each row where deleted is not null.

Upvotes: 2

Related Questions