Cina
Cina

Reputation: 10199

R row-wise counting the number of specific occurrences in a column and fill a column that shows the incremental occurrences

I want to count the number of a specific instance in a column and fill a column that shows the row-wise increase.
In the example below, I want to count number of TRUEs in the column TrueFalse and insert it into (or create) column TRUES. In other words, the column TRUES increases incrementally when TrueFalse==TRUE. The same goes for the column FALSES.
the counting should restart when ID changes.

data<-read.table(text="
ID    TrueFalse   
1        NA        
1       TRUE       
1       TRUE       
1        NA        
1       FALSE      
1       FALSE      
2       TRUE       
2       FALSE      
2       NA         
2       NA         
2       FALSE      
2       TRUE       
2       FALSE",header=T)      

result:

ID    TrueFalse   TRUES      FALSES
1        NA         0         0
1       TRUE        1         0
1       TRUE        2         0
1        NA         2         0
1       FALSE       2         1
1       FALSE       2         2
2       TRUE        1         0
2       FALSE       1         1
2       NA          1         1
2       NA          1         1
2       FALSE       1         2 
2       TRUE        2         2
2       FALSE       2         3

Please note that TrueFalse column can be character for counting rather than only TRUE or FALSE.

Upvotes: 0

Views: 151

Answers (1)

Ronak Shah
Ronak Shah

Reputation: 388817

We can replace NA values and count using cumsum :

library(dplyr)

data %>%
  group_by(ID) %>%
  mutate(Trues = cumsum(replace(TrueFalse, is.na(TrueFalse), 0)), 
         Falses = cumsum(!replace(TrueFalse, is.na(TrueFalse), TRUE)))


#      ID TrueFalse Trues Falses
#   <int> <lgl>     <dbl>  <int>
# 1     1 NA            0      0
# 2     1 TRUE          1      0
# 3     1 TRUE          2      0
# 4     1 NA            2      0
# 5     1 FALSE         2      1
# 6     1 FALSE         2      2
# 7     2 TRUE          1      0
# 8     2 FALSE         1      1
# 9     2 NA            1      1
#10     2 NA            1      1
#11     2 FALSE         1      2
#12     2 TRUE          2      2
#13     2 FALSE         2      3

If we have character values in TrueFalse we can use :

data %>%
  group_by(ID) %>%
  mutate(Trues = cumsum(replace(TrueFalse, is.na(TrueFalse), '') == 'TRUE'),
         Falses = cumsum(replace(TrueFalse, is.na(TrueFalse), '') == 'FALSE'))

Upvotes: 1

Related Questions