justinvf

Reputation: 3129

Convert non-numeric factor to numeric column with mapping in R

I have a factor in a data frame with levels like hot, warm, tepid, cold, very cold, freezing. I want to map them to an integer column with values in the range [-2, 2] for regression, with some values mapping to the same thing. I want to be able to specify the explicit mapping, so that very hot words map to 2, very cold words to -2, etc. How do I do this cleanly? I would love a function that I just pass some named list to, or something.

Upvotes: 8

Views: 9754

Answers (2)

Leo

Reputation: 2072

Assume a factor vector x holds the categories.

temperatures <- c("hot", "warm", "tepid", "cold", "very cold", "freezing")
set.seed(1)
x <- as.factor(sample(temperatures, 10, replace=TRUE))
x
[1] warm     tepid    cold     freezing warm     freezing freezing cold    
[9] cold     hot     
Levels: cold freezing hot tepid warm

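Create a named numeric vector temp.map with the mapping. Note that "hot" and "warm" map to the same value below. Index it with as.character(x) rather than x itself, because indexing with a factor uses its internal integer codes instead of its labels.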

temp.map <- c("hot"=2, "warm"=2, "tepid"=1, "cold"=0, "very cold"=-1, "freezing"=-1)    
y <- temp.map[as.character(x)]
y
warm    tepid     cold freezing     warm freezing freezing     cold 
   2        1        0       -1        2       -1       -1        0 
cold      hot 
   0        2 
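Since the question asks for a function that takes a named list, the lookup above could be wrapped roughly as follows. This is only a sketch: map_factor is a hypothetical helper name, and stopping on unmapped levels is an assumption about the desired behaviour, not something the original answer specifies.

```r
# Hypothetical helper: map factor labels to numbers via a named vector or list.
map_factor <- function(f, mapping) {
  mapping <- unlist(mapping)          # accept a named list as well as a named vector
  keys <- as.character(f)             # look up by label, not by integer code
  missing_keys <- setdiff(unique(keys[!is.na(keys)]), names(mapping))
  if (length(missing_keys) > 0) {
    # Assumption: an unmapped level should be an error rather than a silent NA
    stop("No mapping for: ", paste(missing_keys, collapse = ", "))
  }
  unname(mapping[keys])               # drop names so the result is a plain numeric vector
}

temp.map <- c("hot" = 2, "warm" = 2, "tepid" = 1, "cold" = 0,
              "very cold" = -1, "freezing" = -1)
x <- factor(c("warm", "tepid", "cold", "freezing", "hot"))
y <- map_factor(x, temp.map)
```

The result can then be assigned to a data frame column, for example df$temp_score <- map_factor(df$temp, temp.map), where df and its column names are placeholders.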

Upvotes: 17

nico

Reputation: 51680

A factor can easily be converted to an integer using as.integer, which returns the internal level codes (1 for the first level, 2 for the second, and so on).

For instance:

> temperatures <- c("Hot", "Warm", "Tepid", "Cold", "Very cold", "Freezing")
> set.seed(12345)
> a <- sample(temperatures, 10, replace = TRUE)
> a <- factor(a, levels = temperatures)
> a
 [1] Very cold Freezing  Very cold Freezing  Tepid     Hot       Warm     
 [8] Cold      Very cold Freezing 
Levels: Hot Warm Tepid Cold Very cold Freezing
> as.integer(a)
 [1] 5 6 5 6 3 1 2 4 5 6

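Subtracting 3 shifts the codes toward the [-2, 2] range. Note that with six levels the result actually runs from -2 to 3, so two levels would have to share a value for everything to fit in [-2, 2]: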

> as.integer(a)-3
  [1]  2  3  2  3  0 -2 -1  1  2  3
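Because six levels cannot be squeezed into five integer values by shifting alone, one option is to index a vector of scores by the integer codes. The sketch below assumes the two coldest levels should share -2 and that hotter means higher; the scores vector and the sample data are illustrative choices, not part of the original answer.

```r
temperatures <- c("Hot", "Warm", "Tepid", "Cold", "Very cold", "Freezing")
a <- factor(c("Very cold", "Freezing", "Tepid", "Hot", "Warm", "Cold"),
            levels = temperatures)

# Assumed scores, one per level, in the same order as levels(a);
# "Very cold" and "Freezing" both map to -2.
scores <- c(2, 1, 0, -1, -2, -2)

# as.integer(a) gives each element's position in levels(a),
# which is then used to pick the matching score.
b <- scores[as.integer(a)]
```

This relies on the level order being set explicitly with levels = temperatures; with the default alphabetical ordering the positions in scores would not line up.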

Upvotes: 8
