Feyzi Bagirov

Reputation: 1372

Error when converting with as.numeric() in R

I have a dataset:

 > x
    Treatment X1 X2
1         T1  6  7
2         T1  5  9
3         T1  8  6
4         T1  4  9
5         T1  7  9
6         T2  3  3
7         T2  1  6
8         T2  2  3
9         T3  2  3
10        T3  5  1
11        T3  3  1
12        T3  2  3

I am trying to find the means of columns X1 and X2. If I run the data as-is, I get an error:

> t1 <- subset(x[2:3], x$Treatment=="T1")
> x_vec <- colMeans(t1, na.rm = TRUE)
Error in colMeans(t1, na.rm = TRUE) : 'x' must be numeric

So, I need to convert X1 and X2 to numeric:

t1$X1 <- as.numeric(as.factor(t1$X1))
t1$X2 <- as.numeric(as.factor(t1$X2))
x_vec <- colMeans(t1, na.rm = TRUE)

But when I do that, I get the wrong result:

> x_vec
 X1  X2 
6.0 4.4 

After the conversion with as.numeric(), t1 shows:

> t1
  X1 X2
1  6  4
2  5  5
3  8  3
4  4  5
5  7  5

Why are the values in X2 changed after converting to numeric?

Upvotes: 0

Views: 3595

Answers (1)

mikeck

Reputation: 3776

This is a pretty common issue that newer R users hit. The problem is your use of as.factor. Running as.numeric on a factor returns the integer code of each level (its position in levels()), rather than converting the label itself to a number. You can either remove the call to as.factor or run as.character on the factor before calling as.numeric.
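To make the difference concrete, here is a minimal sketch that rebuilds the X2 column from the question as a factor. That X2 was stored as a factor is an assumption (the question does not show str(x)), and the names x2, codes and values are only illustrative.

```r
# X2 from the question, stored as a factor (assumed; the question does not show str(x)).
x2 <- factor(c(7, 9, 6, 9, 9, 3, 6, 3, 3, 1, 1, 3))

levels(x2)   # the sorted unique labels: "1" "3" "6" "7" "9"

# as.numeric() on a factor returns each value's position in levels(x2).
codes <- as.numeric(x2)

# Converting to character first parses the labels themselves as numbers.
values <- as.numeric(as.character(x2))

codes[1:5]         # 4 5 3 5 5  (the "changed" X2 values seen in t1)
values[1:5]        # 7 9 6 9 9  (the original T1 values)
mean(values[1:5])  # 8
```

X1 looks unchanged in the question only because its levels happen to be 1 through 8, so each code equals its label.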

Note that some functions, such as as.data.frame, automatically convert character columns to factors (this was the default before R 4.0.0), which can cause problems. Check out the stringsAsFactors option for more info.
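As a rough illustration of that option, the sketch below builds the same small character column twice with data.frame, passing stringsAsFactors explicitly so the result does not depend on the R version. The data frames as_factor and as_char are hypothetical and not taken from the question.

```r
# Same character data, built with stringsAsFactors set explicitly both ways.
as_factor <- data.frame(X2 = c("7", "9", "6"), stringsAsFactors = TRUE)
as_char   <- data.frame(X2 = c("7", "9", "6"), stringsAsFactors = FALSE)

class(as_factor$X2)  # "factor"
class(as_char$X2)    # "character"

# The factor column yields level codes; the character column yields the values.
from_factor <- as.numeric(as_factor$X2)  # 2 3 1
from_char   <- as.numeric(as_char$X2)    # 7 9 6
```

Running str() on a data frame right after reading it in is a quick way to see which columns ended up as factors.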

Upvotes: 2
