Reputation: 1104
Related to my other question - given the following data frame
df0 <- data.frame (id = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
val = c(0, 1, 1, 0, 0, 0, 1, 0, 1, 1))
what is the most straightforward way (i.e. one liner) to add a third column in which, for each id
, sequence numbers appear only when val == 1
, like the following?
id val seq
1 1 0 NA
2 1 1 1
3 1 1 2
4 2 0 NA
5 2 0 NA
6 2 0 NA
7 3 1 1
8 3 0 NA
9 3 1 2
10 3 1 3
Upvotes: 1
Views: 523
Reputation: 118879
Using data.table
:
require(data.table)
setDT(df0)[val == 1L, seq := seq_len(.N), by=id]
# id val seq
# 1: 1 0 NA
# 2: 1 1 1
# 3: 1 1 2
# 4: 2 0 NA
# 5: 2 0 NA
# 6: 2 0 NA
# 7: 3 1 1
# 8: 3 0 NA
# 9: 3 1 2
# 10: 3 1 3
.N
contains the number of observations per group. Start with the HTML vignettes for more.
Upvotes: 3
Reputation: 5951
With 0
instead of NA
:
library(dplyr)
df0 <- df0 %>% group_by(id) %>% mutate(seq = (val==1)*cumsum(val))
with NA
df0[df0$seq==0, 3] <- NA
Upvotes: 1