Jun Lyu

Reputation: 21

as.numeric function with binary data in R

I recently ran into a problem in R that I had not seen before. I have a data set with a dependent variable Accuracy that takes only two values, "0" and "1". Previously, I used data$Accuracy=as.numeric(data$Accuracy) to turn these two levels into numbers, and it worked.

This time, however, when I did the same thing, the "0"s turned into 1s and the "1"s turned into 2s. Is this due to recent changes in R? How do I work around this issue?

Thanks!!

Upvotes: 0

Views: 1637

Answers (3)

GKi

Reputation: 39707

If the column is a factor, the manual (?factor) recommends

as.numeric(levels(data$Accuracy))[data$Accuracy]

to transform it to approximately its original numeric values.
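As a minimal sketch of what this does, assume a small hypothetical factor `acc` standing in for data$Accuracy. Calling as.numeric directly returns the internal level codes, while indexing the numeric levels by the factor recovers the original values:

```r
# Hypothetical stand-in for data$Accuracy stored as a factor
acc <- factor(c("0", "1", "1", "0"))

# Direct conversion returns the internal level codes (1 for "0", 2 for "1")
codes <- as.numeric(acc)

# Convert the level labels to numbers, then index them by the factor
values <- as.numeric(levels(acc))[acc]

print(codes)
print(values)
```

The indexing step works because a factor used as an index is treated as its integer codes, so each element picks the numeric value of its own level.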

Upvotes: 1

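I guess there could be a problem with how the data frame is defined or read from a file. If the original data were only 0 and 1, data$Accuracy should be numeric (integer when read from a file). But a non-numeric character value in just one row turns the whole column into a character vector, which becomes a factor when strings are converted to factors (the default before R 4.0.0). For example: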

> zz<-data.frame(c(0, 0, 1, 1))
> zz
  c.0..0..1..1.
1             0
2             0
3             1
4             1
> zz<-data.frame(c(0, 0, 1, 1, "")) # an empty string
> zz
  c.0..0..1..1.....
1                 0
2                 0
3                 1
4                 1
5                  
> class(zz$c.0..0..1..1.....)
[1] "factor"
> zz<-data.frame(c(0, 0, 1, 1, NA)) # empty numeric data
> zz
  c.0..0..1..1..NA.
1                 0
2                 0
3                 1
4                 1
5                NA
> class(zz$c.0..0..1..1..NA.)
[1] "numeric"
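One way to check for and sidestep this, sketched here with a hypothetical data frame whose column is named Accuracy, is to inspect the column class first and to keep strings as characters when building the data frame. The explicit stringsAsFactors = FALSE argument is only needed on R versions where the default is TRUE:

```r
# Hypothetical data with one empty string among the 0/1 values
zz <- data.frame(Accuracy = c("0", "0", "1", "1", ""),
                 stringsAsFactors = FALSE)

# Inspect the class before converting anything
col_class <- class(zz$Accuracy)
print(col_class)

# A character column converts directly; the empty string becomes NA
zz$Accuracy <- as.numeric(zz$Accuracy)
print(zz$Accuracy)
```

Checking class() or str() on the column before calling as.numeric shows whether the factor problem applies at all.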

Upvotes: 0

akrun

Reputation: 887501

It could be that the column is of class factor. When we call as.numeric on a factor, we get the underlying integer codes of its levels rather than the labels, and those codes start at 1, so "0" becomes 1 and "1" becomes 2. In that case, we can convert to character first and then to numeric:

data$Accuracy <- as.numeric(as.character(data$Accuracy))
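A small sketch of this conversion with a guard, assuming a hypothetical data frame named `data` with a factor column Accuracy, so that the same code can run whether or not the column was read in as a factor:

```r
# Hypothetical data frame with Accuracy stored as a factor
data <- data.frame(Accuracy = factor(c("0", "1", "1", "0")))

# Only go through character when the column really is a factor
if (is.factor(data$Accuracy)) {
  data$Accuracy <- as.numeric(as.character(data$Accuracy))
}

print(data$Accuracy)
print(class(data$Accuracy))
```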

Upvotes: 2
