Reputation: 3129
I have a factor in a data frame with levels like hot, warm, tepid, cold, very cold, freezing. I want to map them to an integer column with values in the range [-2, 2] for regression, with some values mapping to the same thing. I want to be able to specify the explicit mapping, so that very hot words map to 2, very cold words to -2, etc. How do I do this cleanly? I would love a function that I just pass some named list to, or something.
Upvotes: 8
Views: 9754
Reputation: 2072
Assume a factor vector x holds the categories.
temperatures <- c("hot", "warm", "tepid", "cold", "very cold", "freezing")
set.seed(1)
x <- as.factor(sample(temperatures, 10, replace=TRUE))
x
[1] warm tepid cold freezing warm freezing freezing cold
[9] cold hot
Levels: cold freezing hot tepid warm
Create a numeric vector temp.map with the mapping. Note that "hot" and "warm" map to the same value below.
temp.map <- c("hot"=2, "warm"=2, "tepid"=1, "cold"=0, "very cold"=-1, "freezing"=-1)
y <- temp.map[as.character(x)]
y
warm tepid cold freezing warm freezing freezing cold
2 1 0 -1 2 -1 -1 0
cold hot
0 2
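The lookup above can be wrapped in a small function, which is roughly what the question asks for. This is only a sketch: map_levels is a hypothetical helper name, not a base R function, and the choice to stop on unmapped levels is an assumption. It also drops the names and converts the result to a plain integer vector.

```r
# Hypothetical helper (not part of base R): map factor levels to integers
# through a named vector such as temp.map.
map_levels <- function(f, mapping) {
  labels <- as.character(f)
  # Assumed policy: fail loudly if a level has no entry in the mapping.
  unmapped <- setdiff(unique(labels[!is.na(labels)]), names(mapping))
  if (length(unmapped) > 0) {
    stop("No mapping for level(s): ", paste(unmapped, collapse = ", "))
  }
  # Index by label, then drop the names to get a plain integer vector.
  as.integer(unname(mapping[labels]))
}

temp.map <- c("hot" = 2, "warm" = 2, "tepid" = 1, "cold" = 0,
              "very cold" = -1, "freezing" = -1)
x <- factor(c("warm", "tepid", "cold", "freezing", "hot"))
map_levels(x, temp.map)
# [1]  2  1  0 -1  2
```

For a data frame column, the same call can be assigned to a new column, for example df$score <- map_levels(df$temp, temp.map), where df and its column names are placeholders.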
Upvotes: 17
Reputation: 51680
A factor can easily be converted to an integer using as.integer.
For instance:
> temperatures <- c("Hot", "Warm", "Tepid", "Cold", "Very cold", "Freezing")
> set.seed(12345)
> a <- sample(temperatures, 10, replace = TRUE)
> a <- factor(a, levels = temperatures)
> a
[1] Very cold Freezing Very cold Freezing Tepid Hot Warm
[8] Cold Very cold Freezing
Levels: Hot Warm Tepid Cold Very cold Freezing
> as.integer(a)
[1] 5 6 5 6 3 1 2 4 5 6
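If you need it near the [-2, 2] range, you can subtract an offset. Note that with six levels, subtracting 3 gives values from -2 to 3, and Hot (the first level) gets the lowest value: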
> as.integer(a)-3
[1] 2 3 2 3 0 -2 -1 1 2 3
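Because as.integer returns each value's position in the level order, an explicit mapping can also be sketched by indexing a vector of scores with those codes. The scores vector below is an assumed example (one value per level, in level order), chosen so that hot maps high and the two coldest levels share -2. The data is made up for illustration.

```r
temperatures <- c("Hot", "Warm", "Tepid", "Cold", "Very cold", "Freezing")
a <- factor(c("Very cold", "Freezing", "Tepid", "Hot", "Warm", "Cold"),
            levels = temperatures)

# Assumed scores, one per level in the order given by levels(a).
scores <- c(2L, 1L, 0L, -1L, -2L, -2L)

# as.integer(a) gives the level codes 5 6 3 1 2 4, which index into scores.
scores[as.integer(a)]
# [1] -2 -2  0  2  1 -1
```

This relies on the level order being fixed explicitly with levels = temperatures. If the factor is built without it, the levels are sorted alphabetically and the positions no longer line up with scores.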
Upvotes: 8