Summing consecutive values, broken up by specific value, in R

Question

I'm having a trouble figuring out how to group variables to achieve the desired result from dplyr. I have an experimental dataset set up like this:

  subject   task_phase  block_number trial_number ResponseCorrect
                                         
 1 268301377    1            1            2               1
 2 268301377    1            1            3               1
 3 268301377    1            1            4               1
 4 268301377    1            2            2              -1
 5 268301377    1            2            3               1
 6 268301377    1            2            4               1
 7 268301377    1            3            2               1
 8 268301377    1            3            3              -1
 9 268301377    1            3            4               1
10 268301377    2            1           50               1
11 268301377    2            1           51               1
12 268301377    2            1           52               1
13 268301377    2            2           37              -1
14 268301377    2            2           38               1
15 268301377    2            2           39               1
16 268301377    2            3           41              -1
17 268301377    2            3           42              -1
18 268301377    2            3           43               1

I'm hoping to sum the consecutive "correct" responses, and to have this tally "reset" each time there was an incorrect response:

  subject   task_phase  block_number trial_number ResponseCorrect   ConsecutiveCorrect
                                                 
 1 268301377    1            1            1               1                 1
 2 268301377    1            1            2               1                 2
 3 268301377    1            1            3               1                 3
 4 268301377    1            2            1              -1                 0
 5 268301377    1            2            2               1                 1
 6 268301377    1            2            3               1                 2
 7 268301377    1            3            1               1                 1
 8 268301377    1            3            2              -1                 0
 9 268301377    1            3            3               1                 1
10 268301377    2            1            1               1                 1
11 268301377    2            1            2               1                 2
12 268301377    2            1            3               1                 3
13 268301377    2            2            1              -1                 0
14 268301377    2            2            2               1                 1
15 268301377    2            2            3               1                 2
16 268301377    2            3            1              -1                 0
17 268301377    2            3            2              -1                 0
18 268301377    2            3            3               1                 1

I originally thought I could do something along the lines of df %>% group_by(subject, task_phase, block_number, ResponseCorrect) %>% mutate(ConsecutiveCorrect = cumsum(ResponseCorrect), and that almost works. But, it doesn't give a consecutive value: it just sums up the total number of correct responses per block (. I'm essentially trying to use the -1s as break points that start the summation over again.

Is there a grouping function (Tidyverse or otherwise) that I'm not aware of that could do something along these lines?

Martin Gal · Accepted Answer

You could try

library(dplyr)

data %>% 
  group_by(
    subject, 
    task_phase, 
    block_number, 
    grp = lag(cumsum(ResponseCorrect == -1), default = 0)
    ) %>% 
  mutate(ConsecutiveCorrect = ifelse(ResponseCorrect == -1, 0, cumsum(ResponseCorrect))) %>% 
  ungroup() %>% 
  select(-grp)

which returns

# A tibble: 18 x 6
     subject task_phase block_number trial_number ResponseCorrect ConsecutiveCorrect
                                                      
 1 268301377          1            1            2               1                  1
 2 268301377          1            1            3               1                  2
 3 268301377          1            1            4               1                  3
 4 268301377          1            2            2              -1                  0
 5 268301377          1            2            3               1                  1
 6 268301377          1            2            4               1                  2
 7 268301377          1            3            2               1                  1
 8 268301377          1            3            3              -1                  0
 9 268301377          1            3            4               1                  1
10 268301377          2            1           50               1                  1
11 268301377          2            1           51               1                  2
12 268301377          2            1           52               1                  3
13 268301377          2            2           37              -1                  0
14 268301377          2            2           38               1                  1
15 268301377          2            2           39               1                  2
16 268301377          2            3           41              -1                  0
17 268301377          2            3           42              -1                  0
18 268301377          2            3           43               1                  1

Summing consecutive values, broken up by specific value, in R

Answers (2)

data

Related Questions