pd441

Reputation: 2763

as.numeric changes the values of data that was originally a factor

When I apply as.numeric or as.integer to a column, the values change. Why is this? For example:

# stringsAsFactors = TRUE is needed in R >= 4.0.0 to get a factor column
test <- data.frame(structure(c("52053,34", "79032,83", "20679,06", "20799,56", "20679,06",
                             "21279,45", "51789,44", "54189,45", "73138,89", "73138,89"),
                             .Dim = c(10L, 1L)),
                   stringsAsFactors = TRUE)
names(test) <- "column"

test$b <- as.numeric(test$column)
test$c <- as.integer(test$column)

Upvotes: 0

Views: 2295

Answers (1)

G. Grothendieck

Reputation: 269441

test$column is a factor.

class(test$column)
## [1] "factor"

levels(test$column) shows the labels of the levels of a factor.

levels(test$column)
## [1] "20679,06" "20799,56" "21279,45" "51789,44" "52053,34" "54189,45" "73138,89"
## [8] "79032,83"

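Internally, a factor stores its data as integer codes that index into the levels: 5, 8, 1, etc. These codes are what as.numeric and as.integer return.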

unclass(test$column)
## [1] 5 8 1 2 1 3 4 6 7 7
## attr(,"levels")
## [1] "20679,06" "20799,56" "21279,45" "51789,44" "52053,34" "54189,45" "73138,89"
## [8] "79032,83"

The first element of test$column is represented by the integer 5 because it is the 5th level. Looking at the levels vector we see that the label of the 5th level is

levels(test$column)[5]
## [1] "52053,34"
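The relationship between codes and labels can be sketched on a smaller factor. This is only an illustration: the factor `f` and its explicitly ordered levels are made up for the example and do not come from the question.

```r
# Hypothetical factor with explicitly ordered levels (not from the question)
f <- factor(c("10", "5", "10"), levels = c("5", "10"))

# as.integer() returns the internal codes, i.e. positions in levels(f)
codes <- as.integer(f)

# Indexing the levels by those codes recovers the labels
labels <- levels(f)[codes]
```

Here `codes` holds level positions rather than the numbers 10 and 5, which is the same effect seen in test$b and test$c.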

In general, we want to look up the label of each element, replace the decimal comma with a decimal point, and convert the result to numeric:

as.numeric(sub(",", ".", levels(test$column))[test$column])
##  [1] 52053.34 79032.83 20679.06 20799.56 20679.06 21279.45 51789.44 54189.45
##  [9] 73138.89 73138.89

Alternatively, try this shorter version, which works because sub coerces the factor to its character labels:

as.numeric(sub(",", ".", test$column))
##  [1] 52053.34 79032.83 20679.06 20799.56 20679.06 21279.45 51789.44 54189.45
##  [9] 73138.89 73138.89
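If the conversion is needed in several places, it could be wrapped in a small helper. The sketch below assumes a single decimal comma per value and no thousands separators; the name `comma_to_numeric` is hypothetical and not part of base R.

```r
# Hypothetical helper (not a base R function): convert a factor or
# character vector of comma-decimal strings to numeric.
comma_to_numeric <- function(x) {
  # as.character() yields the labels rather than the integer codes;
  # fixed = TRUE makes sub() treat "," as a literal string, not a regex
  as.numeric(sub(",", ".", as.character(x), fixed = TRUE))
}

x <- factor(c("52053,34", "79032,83", "20679,06"))
result <- comma_to_numeric(x)
```

With the data from the question, it would be called as test$b <- comma_to_numeric(test$column).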

If the numbers had used decimal points in the first place (as opposed to commas), then this would have been sufficient, where x is such a factor:

as.numeric(as.character(x))
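As a minimal sketch of that decimal-point case, using a made-up factor `x`: the second form converts only the levels and then indexes them by the factor, an alternative that the ?factor help page describes as slightly more efficient.

```r
# Hypothetical factor whose labels already use decimal points
x <- factor(c("3.5", "1.25", "3.5"))

# Convert each element's label to numeric
via_character <- as.numeric(as.character(x))

# Convert the levels once, then index by the factor's integer codes
via_levels <- as.numeric(levels(x))[x]
```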

Upvotes: 1
