Raymond Z

Reputation: 21

as.factor not converting integer to factor

I'm teaching myself R right now. I'm trying to convert integer variables into categorical ones with the following:

train[, c("Store", "DayOfWeek")] <- apply(train[,c("Store", "DayOfWeek")], 2, as.factor)

But it's turning the variables into characters instead. I can't figure out why, except possibly R coercion.

'data.frame':   1017209 obs. of  2 variables:
 $ Store        : chr  "1" "2" "3" "4" ...
 $ DayOfWeek    : chr  "5" "5" "5" "5" ...

When I do it to the variables individually (instead of using apply), it works. Thanks.

Upvotes: 2

Views: 7269

Answers (2)

Gopala

Reputation: 10473

As mentioned in the other answer, lapply is the right tool. You can also use dplyr and mutate_each for this task and many similar column transformations, as follows:

library(dplyr)
train <- train %>% mutate_each(funs(as.factor), c(Store, DayOfWeek))
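
In later dplyr releases, mutate_each() and funs() were deprecated in favor of across(). A minimal sketch of the same transformation in that style, assuming dplyr 1.0.0 or newer is installed; the small `train` data frame below is a made-up stand-in for the asker's data, not the real thing:

```r
library(dplyr)

# Hypothetical stand-in for the asker's `train` data frame (values are invented).
train <- data.frame(
  Store     = c(1L, 2L, 3L),
  DayOfWeek = c(5L, 5L, 4L),
  Sales     = c(100, 200, 150)
)

# across() selects the columns and applies as.factor to each one;
# columns that are not selected (Sales) are left untouched.
train <- train %>% mutate(across(c(Store, DayOfWeek), as.factor))

str(train)
```

Only the selected columns change type, so this can be applied to a wide data frame without listing the columns that should stay numeric.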

Upvotes: 0

joran

Reputation: 173517

apply is the wrong tool. The "apply" way to do this is to use lapply, because data frames are lists where each column is an element of the list:

> mtcars[, c('cyl','vs')] <- lapply(mtcars[, c('cyl','vs')], as.factor)
> str(mtcars)
'data.frame':   32 obs. of  11 variables:
 $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
 $ cyl : Factor w/ 3 levels "4","6","8": 2 2 1 2 3 2 3 1 1 2 ...
 $ disp: num  160 160 108 258 360 ...
 $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
 $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
 $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
 $ qsec: num  16.5 17 18.6 19.4 17 ...
 $ vs  : Factor w/ 2 levels "0","1": 1 1 2 2 1 2 1 2 2 2 ...
 $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
 $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
 $ carb: num  4 4 1 1 2 1 4 2 2 4 ...

In general, be cautious about using apply on data frames. The very first line of the documentation of apply makes it clear that the first thing it does is coerce its argument to a matrix, and matrices can only hold data of one type. So your data frame will be instantly converted to all numeric, all integer, or all character values, depending on what's in it.
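
The coercion described here can be seen on a toy example. This is only a sketch: the data frames and the names `df`, `mixed`, and `via_apply` are made up for illustration and are not the asker's data.

```r
# Toy stand-in for the asker's data; names and values are illustrative only.
df <- data.frame(Store = 1:3, DayOfWeek = c(5L, 5L, 4L))

# A data frame with one character column becomes an all-character matrix.
mixed <- as.matrix(data.frame(id = 1:2, label = c("a", "b")))
typeof(mixed)          # "character"

# apply() returns a matrix, and a matrix cannot store factors,
# so the factors produced by as.factor are flattened to character.
via_apply <- apply(df, 2, as.factor)
is.matrix(via_apply)   # TRUE
typeof(via_apply)      # "character"

# lapply() visits each column as a list element, so the factors survive.
df[] <- lapply(df, as.factor)
sapply(df, class)
```

Assigning into `df[]` keeps the data frame structure and replaces only the column contents, which is why the result is still a data frame rather than a plain list.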

Upvotes: 6
