Reputation: 323
Let's have a numeric vector:
a <- round(runif(10, 1, 5), 0)
[1] 3 5 4 2 1 2 3 4 5 2
I need to assign values to these numbers using a table like this:
1 to 2: assign "A"
3 to 4: assign "B"
5: assign "C"
This is a very simple sample table, but there could be many thousands of numbers and tens of intervals.
I could write a nested if structure that tests every number to find the right interval, but I am looking for a better, more vectorised solution. How can I solve this efficiently?
Upvotes: 0
Views: 327
Reputation: 1107
Define a minimum and a maximum value for the variable you want to classify, and decide how many classes you want. The classes are then defined by splitting the range of the variable into intervals of equal length:
minValue <- 1
maxValue <- 5
numClasses <- 3
Define the breaks, which mark the start and end point of each interval:
breaks <- seq(minValue, maxValue, length.out = numClasses+1)
#[1] 1.000000 2.333333 3.666667 5.000000
Then cut your numeric vector with the function cut(), using integer labels. Set the argument include.lowest=TRUE so that the minimum value falls in the first interval:
set.seed(1)
a <- round(runif(20, 1, 5), 0)
#[1] 2 2 3 5 2 5 5 4 4 1 2 2 4 3 4 3 4 5 3 4
labels <- seq_len(length(breaks) - 1) # integer labels
classes <- cut(a, breaks=breaks, labels=labels, include.lowest = TRUE)
#[1] 1 1 2 3 1 3 3 3 3 1 1 1 3 2 3 2 3 3 2 3
If you want the labels to be letters, use the following lines instead:
labels <- LETTERS[1:(length(breaks)-1)]
classes <- cut(a, breaks=breaks, labels=labels, include.lowest = TRUE)
#[1] A A B C A C C C C A A A C B C B C C B C
However, this limits you to 26 classes, because LETTERS contains only the 26 letters A to Z.
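One possible way around the letter limit, sketched here on the assumption that any unique label is acceptable, is to generate the labels with paste0(). The value of numClasses and the "C" prefix are illustrative choices, not part of the original answer:

```r
# Sketch: generated labels, so the number of classes is not capped at 26.
set.seed(1)
a <- round(runif(20, 1, 5), 0)

numClasses <- 30                               # hypothetical: more classes than letters
breaks <- seq(1, 5, length.out = numClasses + 1)
labels <- paste0("C", seq_len(numClasses))     # "C1", "C2", ..., "C30"

# Same cut() call as before, only the labels differ.
classes <- cut(a, breaks = breaks, labels = labels, include.lowest = TRUE)
nlevels(classes)                               # one factor level per interval
```

Any character vector with one unique entry per interval works as labels, so the scheme can be swapped for whatever naming suits the data.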
Upvotes: 2
Reputation: 12569
a <- c(3, 5, 4, 2, 1, 2, 3, 4, 5, 2)
cut(a, breaks=c(0.5, 2.5, 4.5, 10), labels=c("A", "B", "C"))
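The intervals produced by cut() are closed on the right by default, so the breaks above give (0.5, 2.5] for 1 and 2, (2.5, 4.5] for 3 and 4, and (4.5, 10] for 5, which matches the table in the question. With tens of intervals it may be easier to keep the table in a data frame and look the labels up with findInterval(). The following is only a sketch: the names lookup, lower and label are hypothetical, and it assumes every value is at least as large as the first lower bound (smaller values would get index 0 and be silently dropped by the indexing step).

```r
a <- c(3, 5, 4, 2, 1, 2, 3, 4, 5, 2)

# Hypothetical lookup table: each row holds the lower bound of an interval
# and the label for values from that bound up to the next bound.
lookup <- data.frame(lower = c(1, 3, 5),
                     label = c("A", "B", "C"),
                     stringsAsFactors = FALSE)

# findInterval() returns, for each value, the index of the last lower bound
# that is less than or equal to it; that index selects the label.
idx <- findInterval(a, lookup$lower)
result <- lookup$label[idx]
result
# [1] "B" "C" "B" "A" "A" "A" "B" "B" "C" "A"
```

The result is a character vector rather than a factor, so wrap it in factor() if the factor that cut() returns is what you need.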
Upvotes: 2