Reputation: 21
I recently ran into a problem in R that I had not seen before. I have a data set with a dependent variable Accuracy which has only two values, "0" and "1". Previously, I used data$Accuracy=as.numeric(data$Accuracy) to turn these two levels into numbers, and it worked.
This time, however, when I did the same thing, the "0"s turned into "1"s and the "1"s turned into "2"s. Is this due to recent changes in R? How do I work around this issue?
Thanks!!
Upvotes: 0
Views: 1637
Reputation: 39707
If it is a factor, the manual (see ?factor) recommends
as.numeric(levels(data$Accuracy))[data$Accuracy]
to transform it to approximately its original numeric values.
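A minimal sketch of why this works, using a made-up factor f in place of data$Accuracy (the sample values are an assumption, not the asker's data). The levels are converted to numbers once, and the factor is then used as an index, so each element picks out the number that belongs to its level:

```r
# Hypothetical sample standing in for data$Accuracy
f <- factor(c("0", "1", "1", "0"))

# The distinct labels, stored as character strings
lev <- levels(f)

# Convert each label to a number once, not once per row
lev_num <- as.numeric(lev)

# Indexing with a factor uses its internal integer codes,
# so every element is mapped back to the numeric value of its label
result <- lev_num[f]
print(result)
```

Because only the levels are converted, this does less work than converting every element when the column is long and has few distinct values.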
Upvotes: 1
Reputation: 306
I suspect there is a problem with how the data frame was defined or read from a file. If the original data were only 0 and 1, data$Accuracy should be of class integer (or numeric). But a single non-numeric entry in just one row will turn the whole column into a factor (when stringsAsFactors = TRUE, which was the default before R 4.0.0). For example:
> zz<-data.frame(c(0, 0, 1, 1))
> zz
  c.0..0..1..1.
1             0
2             0
3             1
4             1
> zz<-data.frame(c(0, 0, 1, 1, "")) # an empty string
> zz
  c.0..0..1..1.....
1                 0
2                 0
3                 1
4                 1
5
> class(zz$c.0..0..1..1.....)
[1] "factor"
> zz<-data.frame(c(0, 0, 1, 1, NA)) # missing numeric value
> zz
  c.0..0..1..1..NA.
1                 0
2                 0
3                 1
4                 1
5                NA
> class(zz$c.0..0..1..1..NA.)
[1] "numeric"
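To find which rows are responsible, something along these lines may help. This is only a sketch: the CSV text, the stray "x" entry, and the names csv_text, dat, vals, num and bad_rows are invented for illustration, and stringsAsFactors = TRUE is set explicitly to mimic the pre-4.0.0 default.

```r
# Hypothetical file content with one stray non-numeric entry ("x")
csv_text <- "Accuracy\n0\n1\nx\n1"

# stringsAsFactors = TRUE mimics the default behaviour before R 4.0.0
dat <- read.csv(text = csv_text, stringsAsFactors = TRUE)
print(class(dat$Accuracy))

# Work on the labels, not on the factor's internal codes
vals <- as.character(dat$Accuracy)

# Entries that cannot be parsed become NA (warnings silenced on purpose)
num <- suppressWarnings(as.numeric(vals))

# Rows that held a value but failed to parse as a number
bad_rows <- which(is.na(num) & !is.na(vals))
print(dat[bad_rows, , drop = FALSE])
```

Once the offending entries are fixed or set to NA in the source, the column should be read as integer and no conversion is needed.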
Upvotes: 0
Reputation: 887501
It could be that the column is of class factor, and when we use as.numeric on it, we get the underlying integer codes rather than the labels (in R, indexing starts from 1, so "0" becomes 1 and "1" becomes 2). In that case, we can convert to character and then to numeric:
data$Accuracy <- as.numeric(as.character(data$Accuracy))
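A small reproduction of the symptom and the fix, on a made-up data frame (the name df_demo and its values are assumptions for illustration only):

```r
# Hypothetical data frame whose Accuracy column is a factor
df_demo <- data.frame(Accuracy = factor(c("0", "1", "1", "0")))

# Direct conversion returns the internal codes of the levels "0" and "1"
wrong <- as.numeric(df_demo$Accuracy)
print(wrong)   # 1 2 2 1

# Going through character first recovers the labels as numbers
right <- as.numeric(as.character(df_demo$Accuracy))
print(right)   # 0 1 1 0
```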
Upvotes: 2