Randall
Randall

Reputation: 109

How to create "blocks" based on column values by group?

I have a data frame looks like this:

data = data.frame(userID = c("a","a","a","a","a","a","a","a","a","b","b"), 
                 diff = c(1,1,1,81,1,1,1,2,1,1,1)
)

Eventually, I want to get something like this:

data = data.frame(userID = c("a","a","a","a","a","a","a","a","a","b","b"), 
                  diff = c(1,1,1,81,1,1,1,2,1,1,1),
                  block = c(1,1,1,2,2,2,2,3,3,1,1)
)

So bascially, what I want to do is that everytime the value in diff column is greater than 1, a new block is created. And I want to do this by group, i.e. userID.

Right now I am thinking about using LOCF or writing a loop, but it does not seem to work. Any advice? Thanks!

Upvotes: 0

Views: 264

Answers (2)

akrun
akrun

Reputation: 887941

An option would be to group by 'userID' and then take the cumulative sum of the logical expression (diff > 1)

library(dplyr)
data %>% 
   group_by(userID) %>% 
   mutate(block = 1 + cumsum(diff > 1))
# A tibble: 11 x 3
# Groups:   userID [2]
#   userID  diff block
#   <fct>  <dbl> <dbl>
# 1 a          1     1
# 2 a          1     1
# 3 a          1     1
3 4 a         81     2
# 5 a          1     2
3 6 a          1     2
# 7 a          1     2
# 8 a          2     3
# 9 a          1     3
#10 b          1     1
#11 b          1     1

Upvotes: 1

GKi
GKi

Reputation: 39727

In base you can use ave like:

data$block <- ave(data$diff>1, data$userID, FUN=cumsum)+1
#   userID diff block
#1       a    1     1
#2       a    1     1
#3       a    1     1
#4       a   81     2
#5       a    1     2
#6       a    1     2
#7       a    1     2
#8       a    2     3
#9       a    1     3
#10      b    1     1
#11      b    1     1

Upvotes: 1

Related Questions