Reputation: 677
The data looks like this:
df <- data.frame("Grp"=c(rep("A",10),rep("B",10)),
                 "Year"=c(seq(2001,2010,1),seq(2001,2010,1)),
                 "Treat"=c(as.character(c(0,0,1,1,1,1,0,0,1,1)),
                           as.character(c(1,1,1,0,0,0,1,1,1,0))))
df
Grp Year Treat
1 A 2001 0
2 A 2002 0
3 A 2003 1
4 A 2004 1
5 A 2005 1
6 A 2006 1
7 A 2007 0
8 A 2008 0
9 A 2009 1
10 A 2010 1
11 B 2001 1
12 B 2002 1
13 B 2003 1
14 B 2004 0
15 B 2005 0
16 B 2006 0
17 B 2007 1
18 B 2008 1
19 B 2009 1
20 B 2010 0
All I want is to generate another column, seq, that counts consecutive runs of Treat within each Grp, keeping the rows in Year order. I think the hard part is that when Treat turns to 0, seq should be 0 (or any placeholder), and the count should restart from 1 when Treat turns back to non-zero. An example of the final data frame is below:
Grp Year Treat seq
1 A 2001 0 0
2 A 2002 0 0
3 A 2003 1 1
4 A 2004 1 2
5 A 2005 1 3
6 A 2006 1 4
7 A 2007 0 0
8 A 2008 0 0
9 A 2009 1 1
10 A 2010 1 2
11 B 2001 1 1
12 B 2002 1 2
13 B 2003 1 3
14 B 2004 0 0
15 B 2005 0 0
16 B 2006 0 0
17 B 2007 1 1
18 B 2008 1 2
19 B 2009 1 3
20 B 2010 0 0
Any suggestions would be much appreciated!
Upvotes: 0
Views: 140
Reputation: 388817
With rleid from data.table, combined with dplyr, you can do:
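library(dplyr)

df %>%
  # one group per run of identical consecutive Treat values within each Grp
  group_by(Grp, grp = data.table::rleid(Treat)) %>%
  # position within the run, zeroed out when Treat is "0"
  mutate(seq = row_number() * as.integer(Treat)) %>%
  ungroup() %>%
  select(-grp)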
# Grp Year Treat seq
# <chr> <dbl> <chr> <int>
# 1 A 2001 0 0
# 2 A 2002 0 0
# 3 A 2003 1 1
# 4 A 2004 1 2
# 5 A 2005 1 3
# 6 A 2006 1 4
# 7 A 2007 0 0
# 8 A 2008 0 0
# 9 A 2009 1 1
#10 A 2010 1 2
#11 B 2001 1 1
#12 B 2002 1 2
#13 B 2003 1 3
#14 B 2004 0 0
#15 B 2005 0 0
#16 B 2006 0 0
#17 B 2007 1 1
#18 B 2008 1 2
#19 B 2009 1 3
#20 B 2010 0 0
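If you would rather avoid extra packages, the same run-counting idea can probably be sketched in base R with rle() and ave(). This is only a minimal sketch, not the approach used above: it assumes Treat holds values that convert cleanly to integers, and it sorts by Grp and Year first (an assumption, since the example data is already in that order).

```r
# Rebuild the example data from the question (Treat is stored as character)
df <- data.frame(
  Grp   = rep(c("A", "B"), each = 10),
  Year  = rep(2001:2010, times = 2),
  Treat = as.character(c(0, 0, 1, 1, 1, 1, 0, 0, 1, 1,
                         1, 1, 1, 0, 0, 0, 1, 1, 1, 0))
)

# Assumption: runs are only meaningful when rows are ordered by Grp, then Year
df <- df[order(df$Grp, df$Year), ]

treat <- as.integer(df$Treat)

# Within each Grp: find runs of zero / non-zero values with rle(),
# number the positions inside every run with sequence(),
# then multiply by the non-zero flag so rows with Treat == 0 get 0
df$seq <- ave(treat, df$Grp, FUN = function(x) {
  nz <- x != 0
  sequence(rle(nz)$lengths) * nz
})

df
```

Because the runs are computed on the zero / non-zero flag rather than on the raw value, the counter keeps going if Treat ever switches between two different non-zero values; swap rle(nz) for rle(x) if each distinct value should start a new count.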
Upvotes: 1