Orion
Orion

Reputation: 1104

Generateing a per-group sequence based on a specific value in an existing column

Related to my other question - given the following data frame

df0 <- data.frame (id  = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
                   val = c(0, 1, 1, 0, 0, 0, 1, 0, 1, 1))

what is the most straightforward way (i.e. one liner) to add a third column in which, for each id, sequence numbers appear only when val == 1, like the following?

   id val  seq
1   1   0    NA
2   1   1    1
3   1   1    2
4   2   0    NA
5   2   0    NA
6   2   0    NA
7   3   1    1
8   3   0    NA
9   3   1    2
10  3   1    3

Upvotes: 1

Views: 523

Answers (2)

Arun
Arun

Reputation: 118879

Using data.table:

require(data.table)
setDT(df0)[val == 1L, seq := seq_len(.N), by=id]
#     id val seq
#  1:  1   0  NA
#  2:  1   1   1
#  3:  1   1   2
#  4:  2   0  NA
#  5:  2   0  NA
#  6:  2   0  NA
#  7:  3   1   1
#  8:  3   0  NA
#  9:  3   1   2
# 10:  3   1   3

.N contains the number of observations per group. Start with the HTML vignettes for more.

Upvotes: 3

dimitris_ps
dimitris_ps

Reputation: 5951

With 0 instead of NA:

library(dplyr)

df0 <- df0 %>% group_by(id) %>% mutate(seq = (val==1)*cumsum(val))

with NA

df0[df0$seq==0, 3] <- NA

Upvotes: 1

Related Questions