How to split data based on range

Question

I need to split the data into multiple groups based on range of values. Range here is 20 to 73.

As you can see in the image below, I need to split into such 3 different groups. You should also notice that right after a value hits 70-73 range, next value would be around 40's and then falls to 20's so the transition is gradual.

I am not concerned about the transitory values and intend to ignore them.

Sample data:

structure(list(V1 = c(27, 28, 34, 35, 47, 50, 52, 54, 55, 68, 
                      69, 73, 45, 39, 30, 21, 23, 24, 22, 26, 
                      29, 31, 32, 35, 42, 44, 46, 50, 55, 66, 
                      69, 70, 47, 40, 33, 21, 22, 29, 31, 38, 
                      47, 55, 59, 64, 66, 71)), 
          class = "data.frame", 
          row.names = c(NA, -45L))

Code I tried:

df[, ID := cumsum(V1>=73)+1]

arg0naut91 · Accepted Answer

Perhaps this would work for you:

library(dplyr)

df %>%
  group_by(groups = cumsum(coalesce(as.numeric(V1 < lag(V1) & lag(V1) >= 70), 1))) %>%
  filter(!coalesce(lead(cumsum(coalesce(as.numeric(V1 > lag(V1)), 1))), 99) == 1) %>%
  arrange(groups, V1)

Output:

   V1 groups
1  27      1
2  28      1
3  34      1
4  35      1
5  47      1
6  50      1
7  52      1
8  54      1
9  55      1
10 68      1
11 69      1
12 73      1
13 21      2
14 22      2
15 23      2
16 24      2
17 26      2
18 29      2
19 31      2
20 32      2
21 35      2
22 42      2
23 44      2
24 46      2
25 50      2
26 55      2
27 66      2
28 69      2
29 70      2
30 21      3
31 22      3
32 29      3
33 31      3
34 38      3
35 47      3
36 55      3
37 59      3
38 64      3
39 66      3
40 71      3

Data:

df <- structure(list(V1 = c(27, 28, 34, 35, 47, 50, 52, 54, 55, 68, 
69, 73, 45, 39, 30, 21, 23, 24, 22, 26, 29, 31, 32, 35, 42, 44, 
46, 50, 55, 66, 69, 70, 47, 40, 33, 21, 22, 29, 31, 38, 47, 55, 
59, 64, 66, 71)), class = "data.frame", row.names = c(NA, -46L
))

How to split data based on range

Answers (2)

Related Questions