nand

Reputation: 517

Please explain how converting a factor variable to numeric works in R

Can someone please explain exactly how as.numeric(levels(x))[x] works? Here x is a factor variable (for example, x<-as.factor(sample(1:5,20,replace=TRUE))). As far as I understand, we first get the levels of x (which are character) and then convert them to numeric. What happens after that I am not able to follow. I know this expression gives the same result as as.numeric(as.character(x)).

Upvotes: 0

Views: 164

Answers (2)

Manos Papadakis

Reputation: 593

I have always been confused by R's factors. Usually, I use a nice idea from the Rfast package, the function Rfast::ufactor. It represents a factor using its original type.

Here is an example:

x <- rnorm(10)
fx<- Rfast::ufactor(x)
fx$levels # you can get the levels like this
fx$values # you can get the values like this

Fast and simple. Rfast::ufactor is much faster than R's factor, but I will not post a benchmark because it doesn't fit the question.

Upvotes: 2

IRTFM

Reputation: 263301

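R factors are vectors of integers that serve as indices into the levels character vector. In as.numeric(levels(x))[x], the inner call levels(x) returns that character vector, and as.numeric() converts it into numeric values. Indexing the result with [x] then picks out, for each element of x, the numeric value of its level, so the values "5", "2", "4" and so on come back as numbers.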

> x<-as.factor(sample(1:5,20,replace=TRUE)) 

The storage class of factor objects is integer:

> dput (x)
structure(c(4L, 2L, 3L, 4L, 5L, 2L, 2L, 2L, 1L, 2L, 4L, 2L, 1L, 
5L, 5L, 4L, 1L, 5L, 1L, 5L), .Label = c("1", "2", "3", "4", "5"
), class = "factor")

The levels() function returns the .Label attribute of a factor, and when a factor is used as an index, it gets handled as an integer:

> levels(x)[x]
 [1] "4" "2" "3" "4" "5" "2" "2" "2" "1" "2" "4" "2" "1" "5" "5" "4" "1" "5" "1" "5"
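To make the indexing step easier to see, here is a minimal sketch using a small hypothetical factor f (not from the question) whose levels differ from their integer codes. It assumes only base R and shows that indexing with the factor behaves the same as indexing with its integer codes:

```r
# Hypothetical example factor; values chosen so the labels
# differ from the underlying integer codes.
f <- factor(c("30", "10", "20", "10"))

levels(f)                # the sorted unique labels, as character

# The underlying codes are 1-based positions in levels(f)
codes <- as.integer(f)
codes

# Indexing with the factor itself vs. indexing with its codes
by_factor <- levels(f)[f]
by_codes  <- levels(f)[codes]
identical(by_factor, by_codes)
```

Both forms recover the original labels in their original order, which is why [x] works as the last step of the expression in the question.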

This method of conversion or extraction is slightly faster than as.character(x), but as you have experienced, it may seem a bit cryptic if you haven't worked through what is happening "under the hood" (or "bonnet" if that's what it's called in your part of the English-speaking world).
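As a rough illustration of the full conversion, the sketch below compares the two equivalent forms from the question on a hypothetical numeric-valued factor f, and contrasts them with calling as.numeric() directly on the factor, which returns the integer codes rather than the original values:

```r
# Hypothetical factor built from numbers; its levels are "10", "20", "30"
f <- factor(c(30, 10, 20, 10))

# Convert the 3 levels once, then index by the factor's codes
via_levels <- as.numeric(levels(f))[f]

# Convert every element to character first, then to numeric
via_character <- as.numeric(as.character(f))

# Applied directly to the factor, as.numeric() gives the codes instead
codes_only <- as.numeric(f)

identical(via_levels, via_character)
via_levels
codes_only
```

The first form converts only the distinct levels, while the second converts every element, which is where the small speed difference comes from.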

Upvotes: 2
