Calculate lengths of sequences of repeating numbers in a vector in R

Question

Here is the data:

marker <- c(0,0,0,0,3,3,0,0,5,5,5,0,0,0,
            1,1,2,2,2,2,0,0,1,1,1,3,3,3,
            1,1,2,2,2,0,0,1,1,1,5,5,5,5)

These markers show what the participant was doing during an eye-tracking study: 0 = no trial, 1 = trial onset, and 2, 3, 5 = different types of tasks. The data before the first 1 comes from the eye-tracker test and can be discarded.

What I need to do (preferably with dplyr):

1. Delete the data before the first 1
2. Calculate the length of each sequence of repeating numbers (n_samples)
3. Assign sequential ID numbers to trials, and 0 to no-trial and trial-onset rows (trial_number)

Desired output:

marker  n_samples  trial_number
1       2          0
1       2          0
2       4          1
2       4          1
2       4          1
2       4          1
0       2          0
0       2          0
1       3          0
1       3          0
1       3          0
3       3          2
3       3          2
3       3          2
1       2          0
1       2          0
2       3          3
2       3          3
2       3          3
0       2          0
0       2          0
1       3          0
1       3          0
1       3          0
5       4          4
5       4          4
5       4          4
5       4          4

I found this answer, but wasn't able to modify the code to fit my task.

Thank you!

Ronak Shah · Accepted Answer

You can do this with dplyr and data.table's rleid function.

library(dplyr)

tibble(marker) %>%
  # Drop rows before the first 1
  filter(row_number() >= match(1, marker)) %>%
  # Label each run of identical values (grp) and count the rows in each run
  add_count(grp = data.table::rleid(marker), name = 'n_samples') %>%
  # Number the trial runs (markers other than 0 and 1); all other rows get 0
  mutate(trial_number = with(rle(!marker %in% c(1, 0)),
                            rep(cumsum(values) * values, lengths))) %>%
  # Drop the helper run-ID column
  select(-grp)

This returns -

#   marker n_samples trial_number
#1       1         2            0
#2       1         2            0
#3       2         4            1
#4       2         4            1
#5       2         4            1
#6       2         4            1
#7       0         2            0
#8       0         2            0
#9       1         3            0
#10      1         3            0
#11      1         3            0
#12      3         3            2
#13      3         3            2
#14      3         3            2
#15      1         2            0
#16      1         2            0
#17      2         3            3
#18      2         3            3
#19      2         3            3
#20      0         2            0
#21      0         2            0
#22      1         3            0
#23      1         3            0
#24      1         3            0
#25      5         4            4
#26      5         4            4
#27      5         4            4
#28      5         4            4
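
To make the run-length logic easier to follow, here is a minimal base R sketch of the same three steps, without dplyr or data.table. The names kept, runs, is_trial and result are only illustrative and not part of the answer above. One assumption to note: this version numbers every run of a task marker separately, so two different task markers directly after each other would get two trial numbers, whereas the rle() call in the pipeline above would treat them as one trial. On the data from the question both give the same result.

```r
marker <- c(0,0,0,0,3,3,0,0,5,5,5,0,0,0,
            1,1,2,2,2,2,0,0,1,1,1,3,3,3,
            1,1,2,2,2,0,0,1,1,1,5,5,5,5)

# Step 1: drop everything before the first 1
kept <- marker[seq_along(marker) >= match(1, marker)]

# Run-length encode: one entry per run of identical values
runs <- rle(kept)

# Step 2: repeat each run's length once per element of that run
n_samples <- rep(runs$lengths, runs$lengths)

# Step 3: a run counts as a trial when its value is neither 0 nor 1
is_trial <- !runs$values %in% c(0, 1)

# cumsum() numbers the trial runs 1, 2, 3, ...; multiplying by is_trial
# sets the non-trial runs back to 0
trial_number <- rep(cumsum(is_trial) * is_trial, runs$lengths)

result <- data.frame(marker = kept, n_samples, trial_number)
```

Printing runs$values and runs$lengths side by side is a quick way to check each step before expanding back to one row per sample.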
