Reputation: 421
I have a data frame like so:
>df
classA classB classC classD
item1 0 0 34 6
item2 2 12 267 12
item3 45 26 3 5876
item4 23 110 674 17
item5 1 14 98 17
>class(df)
[1] "data.frame"
>typeof(df)
[1] "list"
>is.factor(df)
[1] FALSE
When I convert it to a numeric matrix (to do some operations on it), values of the first column (only) are changed.
>data.matrix(df)
classA classB classC classD
item1 1 0 34 6
item2 3 12 267 12
item3 59 26 3 5876
item4 34 110 674 17
item5 2 14 98 17
I don't get it. Where do these numbers come from? How can I convert the data frame to a numeric matrix properly?
Upvotes: 1
Views: 682
Reputation: 11981
I would guess that the first column of df is a factor (you can check by typing is.factor(df[,1])).
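The function data.matrix replaces a factor with its internal integer codes (the position of each value among the factor's levels), not with the values that are printed. That is why you get different numbers.
One way to circumvent this is to convert the first column to numeric first, going through character with as.numeric(as.character(df[,1])), because as.numeric alone also returns the internal codes. Alternatively, use as.matrix instead, but note that as.matrix returns a character matrix as long as any column is still a factor.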
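A minimal sketch of the effect, assuming classA was read in as a factor. The data frame below is a hypothetical two-column reconstruction, not the asker's real data, so the codes it produces differ from those in the question (whose column presumably has more distinct values).

```r
# Hypothetical reconstruction: classA stored as a factor, classB as plain numbers.
df <- data.frame(
  classA = factor(c("0", "2", "45", "23", "1")),
  classB = c(0, 12, 26, 110, 14),
  row.names = paste0("item", 1:5)
)

# Check every column's class at once instead of testing the whole data frame.
col_classes <- sapply(df, class)

# data.matrix() replaces the factor by its internal integer codes,
# i.e. the rank of each label among the sorted levels "0" "1" "2" "23" "45".
codes <- data.matrix(df)[, "classA"]

# Going through character recovers the printed values before building the matrix.
fixed <- df
fixed$classA <- as.numeric(as.character(fixed$classA))
m <- data.matrix(fixed)
```

Here codes holds 1, 3, 5, 4, 2, while m keeps the original 0, 2, 45, 23, 1 in its first column.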
Upvotes: 1
Reputation: 3221
You should use as.matrix:
> df
ClassA ClassB ClassC ClassD
1 0 0 34 6
2 2 12 267 12
3 45 26 3 5876
4 23 110 674 17
5 1 98 98 17
> as.matrix(df)
ClassA ClassB ClassC ClassD
[1,] 0 0 34 6
[2,] 2 12 267 12
[3,] 45 26 3 5876
[4,] 23 110 674 17
[5,] 1 98 98 17
> class(as.matrix(df))
[1] "matrix"
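One caveat, sketched under the assumption that a column is a factor (as suspected for the asker's classA): as.matrix then yields a character matrix rather than a numeric one, so the factor still has to be converted first. The small data frame is hypothetical.

```r
# Hypothetical data frame with one factor column and one numeric column.
df <- data.frame(
  classA = factor(c("0", "2", "45")),
  classB = c(0, 12, 26)
)

# With a factor present, as.matrix() falls back to a character matrix.
m_chr <- as.matrix(df)
is_char <- is.character(m_chr)

# After converting the factor via character, the result is numeric.
df$classA <- as.numeric(as.character(df$classA))
m_num <- as.matrix(df)
is_num <- is.numeric(m_num)
```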
Upvotes: 2