Reputation: 659
Hi I have something like:
df<-data.frame(group=c(1, 1, 1, 1, 2, 2, 2,3,3,3,4,4,4),
number=c(0,1,1,1,1,1,0,1,0,1,0,1,1))
I want my 'want' to count the times 'number'=1 repeats per 'group'. Then I want the max number of times it repeats (want2). I want the 'want' to reset to 1 when 'number'=0:
dfwant<-data.frame(group=c(1, 1, 1, 1, 2, 2, 2,3,3,3,4,4,4),
number=c(0,1,1,1,1,1,0,1,0,1,0,1,1),
want=c(1,1,2,3,1,2,1,1,1,1,1,1,2),
want2=c(3,3,3,3,2,2,2,1,1,1,2,2,2))
Thank you!
Upvotes: 1
Views: 73
Reputation: 886938
We can do this easily with rleid
and rowid
from data.table
library(dplyr)
library(data.table)
df %>%
group_by(group) %>%
mutate(want = rowid(rleid(number)), want1 = max(want))
# A tibble: 13 x 4
# Groups: group [4]
# group number want want1
# <dbl> <dbl> <int> <int>
# 1 1 0 1 3
# 2 1 1 1 3
# 3 1 1 2 3
# 4 1 1 3 3
# 5 2 1 1 2
# 6 2 1 2 2
# 7 2 0 1 2
# 8 3 1 1 1
# 9 3 0 1 1
#10 3 1 1 1
#11 4 0 1 2
#12 4 1 1 2
#13 4 1 2 2
Or using the data.table
syntax
library(data.table)
setDT(df)[, want := rowid(rleid(number)),.(group)][, want1 := max(want), group][]
Upvotes: 2
Reputation: 39647
In base you can use ave
for getting the cumsum
per group like:
df$want <- pmax(1, ave(df$number, df$group, cumsum(df$number==0), FUN=cumsum))
df$want2 <- ave(df$want, df$group, FUN=max)
identical(df, dfwant)
#[1] TRUE
To reset when df$number==0
you can use cumsum(df$number==0)
as an additional grouping, as already shown by @Ronak-Shah, for ave
.
In case df$number
holds also other values than 0
and 1
you have to use:
df$want <- pmax(1, ave(df$number==1, df$group, cumsum(df$number==0)
, FUN=cumsum))
Upvotes: 1
Reputation: 388807
Since we want to reset when number = 0
, we group_by
group
and create a new group whenever the number is 0. We then calculate want
by taking cumsum
of occurrence of 1 within each group and want2
by taking max
in group.
library(dplyr)
df %>%
group_by(group, group1 = cumsum(number == 0)) %>%
mutate(want = cumsum(number == 1),
want = replace(want, number == 0, 1)) %>%
group_by(group) %>%
mutate(want1 = max(want)) %>%
select(-group1)
# group number want want1
# <dbl> <dbl> <dbl> <dbl>
# 1 1 0 1 3
# 2 1 1 1 3
# 3 1 1 2 3
# 4 1 1 3 3
# 5 2 1 1 2
# 6 2 1 2 2
# 7 2 0 1 2
# 8 3 1 1 1
# 9 3 0 1 1
#10 3 1 1 1
#11 4 0 1 2
#12 4 1 1 2
#13 4 1 2 2
Upvotes: 3