Reputation: 57
I have a dataframe with the id and value columns indicated below, but want to determine the Status column based on values in the value column, by id groups.
x <- data.frame(id = c(rep(1,10), rep(2,10), rep(3,10)),
serial = rep(1:10,3),
value = c(rep(1,4), rep(0,3), rep(1,3),
rep(1,4), rep(0,1), rep(-1,2), rep(1,3),
rep(c(1,0),5)),
status = c(rep("Fluctuating", 10),
rep("Fluctuating", 10),
rep("Not fluctuating", 10)))
id serial value status
1 1 1 1 Fluctuating
2 1 2 1 Fluctuating
3 1 3 1 Fluctuating
4 1 4 1 Fluctuating
5 1 5 0 Fluctuating
6 1 6 0 Fluctuating
7 1 7 0 Fluctuating
8 1 8 1 Fluctuating
9 1 9 1 Fluctuating
10 1 10 1 Fluctuating
11 2 1 1 Fluctuating
12 2 2 1 Fluctuating
13 2 3 1 Fluctuating
14 2 4 1 Fluctuating
15 2 5 0 Fluctuating
16 2 6 -1 Fluctuating
17 2 7 -1 Fluctuating
18 2 8 1 Fluctuating
19 2 9 1 Fluctuating
20 2 10 1 Fluctuating
21 3 1 1 Not fluctuating
22 3 2 0 Not fluctuating
23 3 3 1 Not fluctuating
24 3 4 0 Not fluctuating
25 3 5 1 Not fluctuating
26 3 6 0 Not fluctuating
27 3 7 1 Not fluctuating
28 3 8 0 Not fluctuating
29 3 9 1 Not fluctuating
30 3 10 0 Not fluctuating
Here, a group is considered to be fluctuating if three or more 1s is followed by 3 or more (0s or -1s), followed by 3 or more 1s again. It would also be considered fluctuating if three or more alternating 0s-1s-0s, -1s-0s-1s, etc.
Wondering what the best way is to assign the status column, preferably using dplyr
?
Thanks!
Upvotes: 1
Views: 91
Reputation: 160607
library(dplyr)
# library(zoo) # rollapply
threes <- function(z, minlen = 3L, ptn = c(TRUE, FALSE, TRUE)) {
r <- rle(z > 0)
starts <- zoo::rollapply(r$lengths >= minlen, minlen, all, fill = FALSE, align = "left")
for (st in which(starts)) {
if (all(r$values[st + seq_len(minlen) - 1L] == ptn)) return(TRUE)
}
return(FALSE)
}
x %>%
group_by(id) %>%
mutate(status2 = paste0(if (threes(value)) "" else "Not ", "Fluctuating")) %>%
ungroup() %>%
print(n = 99)
# # A tibble: 30 x 5
# id serial value status status2
# <dbl> <int> <dbl> <chr> <chr>
# 1 1 1 1 Fluctuating Fluctuating
# 2 1 2 1 Fluctuating Fluctuating
# 3 1 3 1 Fluctuating Fluctuating
# 4 1 4 1 Fluctuating Fluctuating
# 5 1 5 0 Fluctuating Fluctuating
# 6 1 6 0 Fluctuating Fluctuating
# 7 1 7 0 Fluctuating Fluctuating
# 8 1 8 1 Fluctuating Fluctuating
# 9 1 9 1 Fluctuating Fluctuating
# 10 1 10 1 Fluctuating Fluctuating
# 11 2 1 1 Fluctuating Fluctuating
# 12 2 2 1 Fluctuating Fluctuating
# 13 2 3 1 Fluctuating Fluctuating
# 14 2 4 1 Fluctuating Fluctuating
# 15 2 5 0 Fluctuating Fluctuating
# 16 2 6 -1 Fluctuating Fluctuating
# 17 2 7 -1 Fluctuating Fluctuating
# 18 2 8 1 Fluctuating Fluctuating
# 19 2 9 1 Fluctuating Fluctuating
# 20 2 10 1 Fluctuating Fluctuating
# 21 3 1 1 Not fluctuating Not Fluctuating
# 22 3 2 0 Not fluctuating Not Fluctuating
# 23 3 3 1 Not fluctuating Not Fluctuating
# 24 3 4 0 Not fluctuating Not Fluctuating
# 25 3 5 1 Not fluctuating Not Fluctuating
# 26 3 6 0 Not fluctuating Not Fluctuating
# 27 3 7 1 Not fluctuating Not Fluctuating
# 28 3 8 0 Not fluctuating Not Fluctuating
# 29 3 9 1 Not fluctuating Not Fluctuating
# 30 3 10 0 Not fluctuating Not Fluctuating
Upvotes: 1
Reputation: 9257
Using rle
function and dplyr
library
x %>%
mutate(value_new = ifelse(value == -1, 0, value)) %>%
group_by(id) %>%
mutate(status = ifelse(all(rle(value_new)$lengths >= 3), "Fluctuating", "Not fluctuating")) %>%
select(-value_new)
Output
# A tibble: 30 x 4
# Groups: id [3]
id serial value status
<dbl> <int> <dbl> <chr>
1 1 1 1 Fluctuating
2 1 2 1 Fluctuating
3 1 3 1 Fluctuating
4 1 4 1 Fluctuating
5 1 5 0 Fluctuating
6 1 6 0 Fluctuating
7 1 7 0 Fluctuating
8 1 8 1 Fluctuating
9 1 9 1 Fluctuating
10 1 10 1 Fluctuating
11 2 1 1 Fluctuating
12 2 2 1 Fluctuating
13 2 3 1 Fluctuating
14 2 4 1 Fluctuating
15 2 5 0 Fluctuating
16 2 6 -1 Fluctuating
17 2 7 -1 Fluctuating
18 2 8 1 Fluctuating
19 2 9 1 Fluctuating
20 2 10 1 Fluctuating
21 3 1 1 Not fluctuating
22 3 2 0 Not fluctuating
23 3 3 1 Not fluctuating
24 3 4 0 Not fluctuating
25 3 5 1 Not fluctuating
26 3 6 0 Not fluctuating
27 3 7 1 Not fluctuating
28 3 8 0 Not fluctuating
29 3 9 1 Not fluctuating
30 3 10 0 Not fluctuating
Upvotes: 1