Reputation: 517
Can someone please explain exactly how as.numeric(levels(x))[x] works? Here x is a factor variable (for example, x<-as.factor(sample(1:5,20,replace=TRUE))). As far as I understand, we first get the levels of x (which are character) and then convert them to numeric. What happens after that, I am not able to follow. I know this expression gives the same result as as.numeric(as.character(x)).
Upvotes: 0
Views: 164
Reputation: 593
I am always confused by R's factors. I usually use a neat idea from the Rfast package, the function Rfast::ufactor. It represents a factor using its initial type.
Here is an example:
x <- rnorm(10)
fx<- Rfast::ufactor(x)
fx$levels # you can get the levels like this
fx$values # you can get the values like this
Fast and simple. Rfast::ufactor is much faster than R's factor, but I will not post a benchmark because it doesn't fit the question.
Upvotes: 2
Reputation: 263301
R factors are vectors of integers that serve as indices into the character vector of levels. In as.numeric(levels(x))[x], the inner part, levels(x), returns that character vector; as.numeric() converts the level labels "1", "2", "3", ... into numeric values; and the trailing [x] indexes the result by the factor's integer codes, giving one numeric value per element of x.
> x<-as.factor(sample(1:5,20,replace=TRUE))
The storage mode of factor objects is integer:
> dput (x)
structure(c(4L, 2L, 3L, 4L, 5L, 2L, 2L, 2L, 1L, 2L, 4L, 2L, 1L,
5L, 5L, 4L, 1L, 5L, 1L, 5L), .Label = c("1", "2", "3", "4", "5"
), class = "factor")
The levels() function returns the levels attribute of a factor (shown as .Label in the dput output above), and when a factor is used as an index, it is handled as an integer:
> levels(x)[x]
[1] "4" "2" "3" "4" "5" "2" "2" "2" "1" "2" "4" "2" "1" "5" "5" "4" "1" "5" "1" "5"
This method of conversion or extraction is slightly faster than as.character(x), but as you have experienced, it may seem a bit cryptic if you haven't worked through what is happening "under the hood" (or "bonnet", if that's what it's called in your part of the English-speaking world).
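To make the individual steps concrete, here is a minimal sketch that takes the expression apart on a small, deterministic factor. The factor f and the intermediate names (codes, labels, nums, via_levels, via_char) are illustrative assumptions, not part of the question; the labels were chosen to differ from the integer codes so the difference is visible.

```r
# Illustrative factor (an assumption for this sketch): labels differ from codes
f <- factor(c("10", "20", "10", "30"))

codes  <- as.integer(f)       # integer codes: positions in the levels vector
labels <- levels(f)           # character vector of levels, one entry per level
nums   <- as.numeric(labels)  # numeric conversion, done once per level

# Indexing by a factor uses its integer codes, so this picks one
# numeric value per element of f
via_levels <- nums[f]

# The more readable route: convert every element to character first
via_char <- as.numeric(as.character(f))

print(via_levels)
print(identical(via_levels, as.numeric(levels(f))[f]))
print(identical(via_levels, via_char))

# Pitfall: as.numeric() directly on the factor returns the codes,
# not the values the labels spell out
print(as.numeric(f))
```

The levels route converts only the distinct labels and then does an integer lookup, whereas the as.character() route converts every element, which is presumably where the small speed difference comes from.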
Upvotes: 2