Reputation: 114
I have a table recording customer step on daily basis. The table had Id, date and step column. Some rows contained different steps on the same day for the same Id. Sample as shown below on 5/3/2020 and 5/4/2020 for Id 1: | Id | Date | Step | |:-----|:---------|:-----| | 1 | 5/1/2020 | 1 | | 1 | 5/2/2020 | 1 | | 1 | 5/3/2020 | 0 | | 1 | 5/3/2020 | 5 | | 1 | 5/4/2020 | 2 | | 1 | 5/4/2020 | 10 | | 1 | 5/5/2020 | 1 | | 2 | 5/1/2020 | 1 | | 2 | 5/2/2020 | 2 | | 2 | 5/3/2020 | 0 |
I want to delete rows that contain lesser step, which is 5/3/2020 for 0 step, 5/4/2020 for 2 step for Id 1.
I had tried using row_number() like this:
SELECT
Id,
Date,
step,
ROW_NUMBER() OVER (PARTITION BY Id, Date ORDER BY Id, Date) AS rn
FROM
`dataset.step`
WHERE rn>1
But that will give me rows with higher step, which is not want I want. I also able to select rows with fewer step like this:
SELECT * FROM
`dataset.step` AS A
INNER JOIN
`dataset.step` AS B
ON A.Id = B.Id
AND A.Date = B.Date
WHERE A.step < B.step
But find no way to use it for delete.
Upvotes: 0
Views: 76
Reputation: 173210
Use below approach
select *
from your_table
qualify 1 = row_number() over win
window win as (partition by id, date order by step desc)
if applied to sample data in your question - output is
Upvotes: 2