user13267770

Multiple conditions (breaks) for cut function

I have the following data frame:

library(tibble)
df <- as_tibble(15:-15)  # yields a single column named "value"

Now I want to add intervals in a new column. This is what I tried:

df$Intervall <- cut(df$value, seq(from = -15, to = 15, by = 5), include.lowest=TRUE)

   value Intervall
1     15   (10,15]
2     14   (10,15]
3     13   (10,15]
4     12   (10,15]
5     11   (10,15]
6     10    (5,10]
7      9    (5,10]
8      8    (5,10]
9      7    (5,10]
10     6    (5,10]
11     5     (0,5]
12     4     (0,5]
13     3     (0,5]
14     2     (0,5]
15     1     (0,5]
16     0    (-5,0]
17    -1    (-5,0]
18    -2    (-5,0]
19    -3    (-5,0]
20    -4    (-5,0]
21    -5  (-10,-5]
22    -6  (-10,-5]
23    -7  (-10,-5]
24    -8  (-10,-5]
25    -9  (-10,-5]
26   -10 [-15,-10]
27   -11 [-15,-10]
28   -12 [-15,-10]
29   -13 [-15,-10]
30   -14 [-15,-10]
31   -15 [-15,-10]

The problem with this result is that every 0 in value goes into the interval (-5,0], which breaks the symmetry between positive and negative values: 5 falls into (0,5], but -5 falls into (-10,-5].

I want every 0 to get its own interval, 0 or 0:0, so that the bins are symmetric and -5 falls into the interval 0:-5.
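To make the asymmetry concrete, here is a minimal sketch using base R only. It compares the default right-closed intervals with `right = FALSE` for three values that sit exactly on a break; the variable names are just for illustration.

```r
# Values chosen to sit exactly on the breaks.
x <- c(-5, 0, 5)
breaks <- seq(from = -15, to = 15, by = 5)

# Default right = TRUE: intervals are (a, b], so 0 belongs to (-5,0]
# and -5 belongs to (-10,-5].
right_closed <- cut(x, breaks, include.lowest = TRUE)

# right = FALSE: intervals are [a, b), so 0 belongs to [0,5)
# and -5 belongs to [-5,0).
left_closed <- cut(x, breaks, right = FALSE, include.lowest = TRUE)

data.frame(x, right_closed, left_closed)
```

Either choice puts 0 on one side only, so neither setting alone gives symmetric bins.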

The result should look like this:

   value Intervall
1     15   (10,15]
2     14   (10,15]
3     13   (10,15]
4     12   (10,15]
5     11   (10,15]
6     10    (5,10]
7      9    (5,10]
8      8    (5,10]
9      7    (5,10]
10     6    (5,10]
11     5     (0,5]
12     4     (0,5]
13     3     (0,5]
14     2     (0,5]
15     1     (0,5]
16     0    (0,0]
17    -1    (-5,0]
18    -2    (-5,0]
19    -3    (-5,0]
20    -4    (-5,0]
21    -5    (-5,0]
22    -6  (-10,-5]
23    -7  (-10,-5]
24    -8  (-10,-5]
25    -9  (-10,-5]
26   -10  (-10,-5]
27   -11 [-15,-10]
28   -12 [-15,-10]
29   -13 [-15,-10]
30   -14 [-15,-10]
31   -15 [-15,-10]

jay.sf's solution works for this small sample data frame, but it is not practical for my real data frame, which is much bigger. So I would really like to use cut and add exceptions rather than define everything by hand. I am not sure how that is possible, though.
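One possible reading of "cut and add exceptions" is sketched below. It is only an illustration, not one of the posted answers: it calls `cut` twice on the same breaks, left-closed for negative values and right-closed for positive ones, and treats 0 as the exception. The names `neg`, `pos`, `out` and `result` are made up for this sketch.

```r
x <- 15:-15
breaks <- seq(-15, 15, by = 5)

neg <- cut(x, breaks, right = FALSE)  # [a, b): used for x < 0
pos <- cut(x, breaks, right = TRUE)   # (a, b]: used for x > 0

# Pick the label by sign; 0 is the exception and gets its own label.
out <- ifelse(x < 0, as.character(neg),
              ifelse(x > 0, as.character(pos), "0"))

# Keep the levels ordered: negative bins, then "0", then positive bins.
n <- length(breaks)
lev <- c(levels(neg)[breaks[-1] <= 0], "0", levels(pos)[breaks[-n] >= 0])
result <- factor(out, levels = lev)

data.frame(value = x, Intervall = result)
```

With these breaks, -5 lands in [-5,0) and 5 in (0,5], so the bins mirror each other. Because it does not rely on offsets such as 0.5, the same idea should also carry over to non-integer values.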

Upvotes: 2

Views: 840

Answers (2)

GKi

Reputation: 39657

Maybe subtracting and adding 0.5 around 0 could be usable for you.

cut(15:-15, c(seq(-15,0,5) - 0.5, 0.5 + seq(0,15,5)))
# [1] (10.5,15.5]   (10.5,15.5]   (10.5,15.5]   (10.5,15.5]   (10.5,15.5]  
# [6] (5.5,10.5]    (5.5,10.5]    (5.5,10.5]    (5.5,10.5]    (5.5,10.5]   
#[11] (0.5,5.5]     (0.5,5.5]     (0.5,5.5]     (0.5,5.5]     (0.5,5.5]    
#[16] (-0.5,0.5]    (-5.5,-0.5]   (-5.5,-0.5]   (-5.5,-0.5]   (-5.5,-0.5]  
#[21] (-5.5,-0.5]   (-10.5,-5.5]  (-10.5,-5.5]  (-10.5,-5.5]  (-10.5,-5.5] 
#[26] (-10.5,-5.5]  (-15.5,-10.5] (-15.5,-10.5] (-15.5,-10.5] (-15.5,-10.5]
#[31] (-15.5,-10.5]
#7 Levels: (-15.5,-10.5] (-10.5,-5.5] (-5.5,-0.5] (-0.5,0.5] ... (10.5,15.5]
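For a larger range, the same shifted breaks can be generated instead of typed out. The following is a rough sketch, assuming integer-valued data whose absolute value does not exceed `limit`, and `limit` being a positive multiple of `step`. `cut_symmetric` is a hypothetical helper name, not an existing function, and the integer-style labels are an addition for readability.

```r
# Hypothetical helper: symmetric bins around a bin that holds only 0.
# Assumes integer-valued x, abs(x) <= limit, and limit a multiple of step.
cut_symmetric <- function(x, limit, step) {
  pos <- seq(0, limit, by = step) + 0.5   # 0.5, step + 0.5, ..., limit + 0.5
  breaks <- c(-rev(pos), pos)             # mirror the breaks around zero
  upper <- seq(step, limit, by = step)    # largest integer in each positive bin
  lower <- upper - step + 1               # smallest integer in each positive bin
  pos_labels <- sprintf("[%d,%d]", lower, upper)
  neg_labels <- rev(sprintf("[-%d,-%d]", upper, lower))
  cut(x, breaks, labels = c(neg_labels, "0", pos_labels))
}

df <- data.frame(value = 15:-15)
df$Intervall <- cut_symmetric(df$value, limit = 15, step = 5)
```

The labels describe which integers each bin contains, so they only make sense for whole numbers; for continuous data the default labels from `cut` are the safer choice.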

Upvotes: 1

jay.sf

Reputation: 72813

You could use a trick and cut at slightly below and above zero, e.g. -0.1 and 0.1, and define custom labels=.

# breaks just below and just above zero isolate 0 in its own bin
data.frame(x, interval=cut(x, c(-15.1, -10.1, -5.1, -.1, .1, 5, 10, 15), 
                  labels=c("[-15,-10)", "[-10,-5)", "[-5,-1]", "0", "[1,5]", 
                           "(5,10]", "(10,15]"), include.lowest=TRUE))
#      x  interval
# 1   15   (10,15]
# 2   14   (10,15]
# 3   13   (10,15]
# 4   12   (10,15]
# 5   11   (10,15]
# 6   10    (5,10]
# 7    9    (5,10]
# 8    8    (5,10]
# 9    7    (5,10]
# 10   6    (5,10]
# 11   5     [1,5]
# 12   4     [1,5]
# 13   3     [1,5]
# 14   2     [1,5]
# 15   1     [1,5]
# 16   0         0
# 17  -1   [-5,-1]
# 18  -2   [-5,-1]
# 19  -3   [-5,-1]
# 20  -4   [-5,-1]
# 21  -5   [-5,-1]
# 22  -6  [-10,-5)
# 23  -7  [-10,-5)
# 24  -8  [-10,-5)
# 25  -9  [-10,-5)
# 26 -10  [-10,-5)
# 27 -11 [-15,-10)
# 28 -12 [-15,-10)
# 29 -13 [-15,-10)
# 30 -14 [-15,-10)
# 31 -15 [-15,-10)

Data:

x <- 15:-15

Upvotes: 0
