Reputation: 456
I'm trying to group rows based on consecutive patterns. This is the dataset.
num col1
1 SENSOR_01
2 SENSOR_05
3 SENSOR_05, SENSOR_07
4 SENSOR_05, SENSOR_07
5 SENSOR_07
6 SENSOR_05
7 SENSOR_01, SENSOR_03
8 SENSOR_01
9 SENSOR_03
10 SENSOR_01
11 SENSOR_05
structure(list(num = 1:11, col1 = structure(c(1L, 4L, 5L, 5L, 6L, 4L, 2L, 1L, 3L, 1L, 4L), .Label = c("SENSOR_01", "SENSOR_01, SENSOR_03", "SENSOR_03", "SENSOR_05", "SENSOR_05, SENSOR_07", "SENSOR_07" ), class = "factor")), class = "data.frame", row.names = c(NA, -11L))
If consecutive rows contain SENSOR_05 and/or SENSOR_07, they should be placed in the same group. The same applies to the SENSOR_01 and SENSOR_03 set. Here is my expected table (see the group column).
num col1 group
1 SENSOR_01 1
2 SENSOR_05 2
3 SENSOR_05, SENSOR_07 2
4 SENSOR_05, SENSOR_07 2
5 SENSOR_07 2
6 SENSOR_05 2
7 SENSOR_01, SENSOR_03 3
8 SENSOR_01 3
9 SENSOR_03 3
10 SENSOR_01 3
11 SENSOR_05 4
This is my code, but it doesn't work as expected.
g1 <- c("SENSOR_05", "SENSOR_07")
g2 <- c("SENSOR_01", "SENSOR_03")
test %>%
group_by(group = cumsum(col1 %in% (rep(c(g1, g2)))))
Upvotes: 2
Views: 142
Reputation: 13135
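library(dplyr)
# flag = 1 if the row mentions any sensor in g1, 2 if any in g2, 3 otherwise;
# rleid() then numbers each run of consecutive identical flags
df %>%
  mutate(flag = case_when(grepl(paste(g1, collapse = "|"), col1) ~ 1,
                          grepl(paste(g2, collapse = "|"), col1) ~ 2,
                          TRUE ~ 3),
         group = data.table::rleid(flag))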
num col1 flag group
1 1 SENSOR_01 2 1
2 2 SENSOR_05 1 2
3 3 SENSOR_05, SENSOR_07 1 2
4 4 SENSOR_05, SENSOR_07 1 2
5 5 SENSOR_07 1 2
6 6 SENSOR_05 1 2
7 7 SENSOR_01, SENSOR_03 2 3
8 8 SENSOR_01 2 3
9 9 SENSOR_03 2 3
10 10 SENSOR_01 2 3
11 11 SENSOR_05 1 4
PS: I matched rows that contain SENSOR_05 or SENSOR_07, not only rows that contain both SENSOR_05 and SENSOR_07.
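For reference, the attempt in the question likely fails for two reasons: `%in%` tests for exact equality, so a value such as "SENSOR_05, SENSOR_07" never matches, and `cumsum()` increments on every matching row rather than only when the pattern changes. If you would rather not depend on data.table, the run numbering can also be sketched in base R. This is a minimal sketch, assuming the data frame is called `df` and `col1` is stored as character; the `flag` vector is only a helper for illustration.

```r
# Rebuild the example data from the question (col1 kept as character).
df <- data.frame(
  num = 1:11,
  col1 = c("SENSOR_01", "SENSOR_05", "SENSOR_05, SENSOR_07",
           "SENSOR_05, SENSOR_07", "SENSOR_07", "SENSOR_05",
           "SENSOR_01, SENSOR_03", "SENSOR_01", "SENSOR_03",
           "SENSOR_01", "SENSOR_05"),
  stringsAsFactors = FALSE
)

g1 <- c("SENSOR_05", "SENSOR_07")
g2 <- c("SENSOR_01", "SENSOR_03")

# Same flag as above: 1 for any g1 sensor, 2 for any g2 sensor, 3 otherwise.
flag <- ifelse(grepl(paste(g1, collapse = "|"), df$col1), 1L,
        ifelse(grepl(paste(g2, collapse = "|"), df$col1), 2L, 3L))

# A new group starts at row 1 and wherever the flag differs from the previous row.
df$group <- cumsum(c(TRUE, diff(flag) != 0))

print(df)
```

On the example data this gives the same group column as the expected table in the question. Note that a row mentioning sensors from both sets would be assigned to g1, because that pattern is checked first.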
Upvotes: 1