Reputation: 651
I have searched the site a bit for a solution to this question but have been unable to find an answer that fits precisely what I am looking for. I am attempting to count consecutive duplicate values for each ID number, ordered by the date. My current table resembles the first 3 columns of the table below, whereas the fourth column in what I would like to add.
ID | date | value | consec_duplicates
1 1/1 1 0
1 1/2 2 0
1 1/3 2 1
1 1/4 2 2
1 1/5 3 0
1 1/6 3 1
2 1/14 1 0
2 1/15 2 0
2 1/16 3 0
2 1/17 3 1
2 1/18 4 0
2 1/19 5 0
3 1/4 1 0
3 1/5 2 0
3 1/6 2 1
3 1/7 2 2
3 1/8 2 3
3 1/9 3 0
Does anybody have any ideas of how to construct this fourth column? Thanks!
Upvotes: 0
Views: 1824
Reputation: 1271231
This is a gap-and-islands problem. One method is the difference of row_number()
s to identify the groups.
select t.*,
dense_rank() over (partition by id order by (seqnum - seqnum_value), value) as grp,
row_number() over (partition by id, (seqnum - seqnum_value), value order by date) as grp_seqnum
from (select t.*,
row_number() over (partition by id order by date) as seqnum,
row_number() over (partition by id, value order by date) as seqnum_v
from t
) t;
This is a bit tricky to understand the first time you see it. If you run the subquery and stare at the results long enough, you'll get why the difference is constant for adjacent values.
EDIT:
I think Jorge is right. Your data doesn't have the same value repeated, so you can just do:
select t.*,
row_number() over (partition by id, value order by date) as grp_seqnum
from t;
Upvotes: 1
Reputation: 60513
When the values are actually increasing all the time then this should work:
row_number() over (partition by id, value order by date) - 1
Otherwise Teradata has an extension to Standard SQL for cases like this:
row_number()
over (partition by id
order by date
RESET WHEN MIN(value) -- previous value not equal to current
OVER (partition by id
order by date
rows between 1 preceding and 1 preceding) <> value
) - 1
Upvotes: 1