Davis

Reputation: 508

How to take advantage of R vectorization when using conditional statements? (alternative to looping)

I’m trying to group data based on several conditions in my data frame. At the moment I do this with a utility function that I call in a loop, but since R is vectorised, I’m wondering whether there is a more R-like way to do this.

Items.Ordered <- CMdata$Items.Ordered

orderGroup <- function(Items.Ordered) {
  Items.Ordered <- as.numeric(Items.Ordered)

  if (CMdata$Items.Ordered == 0) {
    return ("NONE")
  } else if (CMdata$Items.Ordered > 0 & CMdata$Items.Ordered <= 3) {
    return ("SMALL")
  } else if (CMdata$Items.Ordered > 3 & CMdata$Items.Ordered <= 8) {
    return ("MEDIUM")
  } else if (CMdata$Items.Ordered > 8) {
    return ("LARGE")
  } else {
    return ("OTHER")
  }
}


Order.Type <- NULL
for (i in 1:nrow(CMdata)) {
  Order.Type <- c(Order.Type, orderGroup(CMdata[i,"Items.Ordered"]))
}
CMdata$Order.Type <- as.factor(Order.Type)

Upvotes: 1

Views: 87

Answers (2)

akuiper

Reputation: 214957

One possible solution is to cut the column into intervals and then relabel the resulting factor according to the interval each value falls into. For example:

Suppose your CMdata contains a column as follows:

CMdata
   Items.Ordered
1             NA
2              0
3              1
4              2
5              3
6              4
7              5
8              6
9              7
10             8
11             9
12            10
13            NA

You can cut and factor it based on your conditions:

CMdata$Order.Type <- factor(cut(CMdata$Items.Ordered, breaks = c(-Inf, 0, 3, 8, Inf)), 
                            exclude = NULL, 
                            labels = c("NONE", "SMALL", "MEDIUM", "LARGE", "OTHER")) 
CMdata
   Items.Ordered Order.Type
1             NA      OTHER
2              0       NONE
3              1      SMALL
4              2      SMALL
5              3      SMALL
6              4     MEDIUM
7              5     MEDIUM
8              6     MEDIUM
9              7     MEDIUM
10             8     MEDIUM
11             9      LARGE
12            10      LARGE
13            NA      OTHER
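
A note on robustness: the call above appears to rely on every interval, plus NA, actually occurring in the data, because the five labels are matched against the levels that factor() finds. The following is only a sketch of a variant that does not depend on that; it passes the four interval labels to cut() directly and fills in "OTHER" for missing values afterwards. The sample data frame is the one from this answer, and grp is just an illustrative name. Unlike the original function, negative values still end up in "NONE" here.

```r
# Sample data from the answer above
CMdata <- data.frame(Items.Ordered = c(NA, 0:10, NA))

# cut() is right-closed by default, so the bins are
# (-Inf,0], (0,3], (3,8] and (8,Inf]; NA stays NA.
grp <- cut(CMdata$Items.Ordered,
           breaks = c(-Inf, 0, 3, 8, Inf),
           labels = c("NONE", "SMALL", "MEDIUM", "LARGE"))

# Replace NA with "OTHER", then rebuild the factor with a fixed level set
# so the levels do not depend on which groups occur in the data.
grp <- as.character(grp)
grp[is.na(grp)] <- "OTHER"
CMdata$Order.Type <- factor(grp,
                            levels = c("NONE", "SMALL", "MEDIUM", "LARGE", "OTHER"))
```

Fixing the levels explicitly also keeps the factor consistent across data sets in which some group happens to be empty.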

Upvotes: 0

janos

Reputation: 124656

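I don't think your program works as intended. The loop passes single values to the orderGroup function, not a vector, but the function body ignores its argument and tests the whole CMdata$Items.Ordered column instead. Those if conditions wouldn't work with vectors anyway, because if expects a single TRUE or FALSE.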
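
To illustrate the point about vectors, here is a small sketch with made-up values (x, cond and outcome are illustrative names, not from the question). The comparison itself is vectorised, but if cannot consume its result:

```r
x <- c(0, 2, 5, 10)

# Comparisons are vectorised: one TRUE/FALSE per element.
cond <- x > 3 & x <= 8

# `if` expects a single TRUE/FALSE. Given a longer vector it raises an
# error in R 4.2.0 and later; earlier versions warn and use only the
# first element. tryCatch() records which of the two happened.
outcome <- tryCatch(
  if (cond) "MEDIUM" else "not MEDIUM",
  error   = function(e) "error",
  warning = function(w) "warning"
)
```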

I think what you really meant was this:

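orderGroup <- function(value) {
  # Scalar version: expects a single value, so && is used instead of &.
  if (is.na(value)) {
    # Added guard: if (NA == 0) would otherwise raise an error.
    "OTHER"
  } else if (value == 0) {
    "NONE"
  } else if (value > 0 && value <= 3) {
    "SMALL"
  } else if (value > 3 && value <= 8) {
    "MEDIUM"
  } else if (value > 8) {
    "LARGE"
  } else {
    # Reached only for negative values.
    "OTHER"
  }
}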

To make this more functional, you can replace the loop with sapply, which applies the function to each element and collects the results:

CMdata$Order.Type <- as.factor(sapply(CMdata$Items.Ordered, orderGroup))
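
Note that sapply still calls the function once per element. If a fully vectorised version is wanted, one option is nested ifelse calls, which mirror the if/else chain but operate on the whole column at once. This is only a sketch: order_group_vec is a hypothetical name, the sample data is assumed for illustration, and NA and negative values are both mapped to "OTHER" as an assumption about the intended behaviour.

```r
# Hypothetical vectorised counterpart of orderGroup
order_group_vec <- function(x) {
  x <- as.numeric(x)
  # Each ifelse() tests every element at once; earlier tests take priority.
  ifelse(is.na(x) | x < 0, "OTHER",
  ifelse(x == 0,           "NONE",
  ifelse(x <= 3,           "SMALL",
  ifelse(x <= 8,           "MEDIUM", "LARGE"))))
}

# Assumed sample data for illustration
CMdata <- data.frame(Items.Ordered = c(NA, 0:10, NA))
CMdata$Order.Type <- as.factor(order_group_vec(CMdata$Items.Ordered))
```

Because the tests run in order, each branch only needs the upper bound of its range; the lower bound is already excluded by the branch before it.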

Upvotes: 1
