Count consecutive duplicate values by group

Question

I have searched the site a bit for a solution to this question but have been unable to find an answer that fits precisely what I am looking for. I am attempting to count consecutive duplicate values for each ID number, ordered by the date. My current table resembles the first 3 columns of the table below, whereas the fourth column in what I would like to add.

ID  |  date  | value  | consec_duplicates
1      1/1       1            0
1      1/2       2            0
1      1/3       2            1
1      1/4       2            2
1      1/5       3            0
1      1/6       3            1
2      1/14      1            0
2      1/15      2            0
2      1/16      3            0
2      1/17      3            1
2      1/18      4            0
2      1/19      5            0
3      1/4       1            0
3      1/5       2            0
3      1/6       2            1
3      1/7       2            2
3      1/8       2            3
3      1/9       3            0

Does anybody have any ideas of how to construct this fourth column? Thanks!

Gordon Linoff · Accepted Answer

This is a gap-and-islands problem. One method is the difference of row_number()s to identify the groups.

select t.*,
       dense_rank() over (partition by id order by (seqnum - seqnum_value), value) as grp,
       row_number() over (partition by id, (seqnum - seqnum_value), value order by date) as grp_seqnum
from (select t.*,
             row_number() over (partition by id order by date) as seqnum,
             row_number() over (partition by id, value order by date) as seqnum_v
      from t
     ) t;

This is a bit tricky to understand the first time you see it. If you run the subquery and stare at the results long enough, you'll get why the difference is constant for adjacent values.

EDIT:

I think Jorge is right. Your data doesn't have the same value repeated, so you can just do:

select t.*,
       row_number() over (partition by id, value order by date) as grp_seqnum
from t;

Count consecutive duplicate values by group

Answers (2)

Related Questions