Reputation: 109
I have a data frame looks like this:
data = data.frame(userID = c("a","a","a","a","a","a","a","a","a","b","b"),
diff = c(1,1,1,81,1,1,1,2,1,1,1)
)
Eventually, I want to get something like this:
data = data.frame(userID = c("a","a","a","a","a","a","a","a","a","b","b"),
diff = c(1,1,1,81,1,1,1,2,1,1,1),
block = c(1,1,1,2,2,2,2,3,3,1,1)
)
So bascially, what I want to do is that everytime the value in diff
column is greater than 1, a new block is created. And I want to do this by group, i.e. userID
.
Right now I am thinking about using LOCF
or writing a loop, but it does not seem to work. Any advice? Thanks!
Upvotes: 0
Views: 264
Reputation: 887941
An option would be to group by 'userID' and then take the cumulative sum of the logical expression (diff > 1
)
library(dplyr)
data %>%
group_by(userID) %>%
mutate(block = 1 + cumsum(diff > 1))
# A tibble: 11 x 3
# Groups: userID [2]
# userID diff block
# <fct> <dbl> <dbl>
# 1 a 1 1
# 2 a 1 1
# 3 a 1 1
3 4 a 81 2
# 5 a 1 2
3 6 a 1 2
# 7 a 1 2
# 8 a 2 3
# 9 a 1 3
#10 b 1 1
#11 b 1 1
Upvotes: 1
Reputation: 39727
In base you can use ave
like:
data$block <- ave(data$diff>1, data$userID, FUN=cumsum)+1
# userID diff block
#1 a 1 1
#2 a 1 1
#3 a 1 1
#4 a 81 2
#5 a 1 2
#6 a 1 2
#7 a 1 2
#8 a 2 3
#9 a 1 3
#10 b 1 1
#11 b 1 1
Upvotes: 1