Reputation: 688
I stumbled on the following problem. I have a data.frame:
A <- data.frame(let = c("A", "B", "C"), x = 1:3, y = 4:6)
The classes of its columns are
sapply(A, class)
let x y
"factor" "integer" "integer"
is.numeric(A$x)
[1] TRUE
is.numeric(A)
[1] FALSE
I do not understand why, although A$x and A$y are numeric, the data.frame composed of only these two columns is not numeric:
is.numeric(A[, c("x", "y")])
[1] FALSE
Removing the factor column does not help...
B <- A
B$let <- NULL
is.numeric(B)
[1] FALSE
is.numeric(B$x)
[1] TRUE
is.numeric(B$y)
[1] TRUE
So, I tried creating a new dataset built only with the numeric columns in A. Is it numeric? No...
C <- data.frame(B$x, B$y)
is.numeric(C)
[1] FALSE
C <- data.frame(as.numeric(B$x), as.numeric(B$y))
is.numeric(C)
[1] FALSE
There must be something I'm missing here. Any help?
Upvotes: 1
Views: 478
Reputation: 15897
A data frame is always a data frame, independent of the classes of its columns, so what you get is the expected behaviour.
If you want to check whether all columns in a data frame are numeric, you can use the following code:
all(sapply(A, is.numeric))
## [1] FALSE
all(sapply(A[, c("x", "y")], is.numeric))
## [1] TRUE
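If you also want to pick out the numeric columns rather than just test for them, something along these lines should work. This is only a sketch: the names num_cols and A_num are made up for illustration, and stringsAsFactors = TRUE is set explicitly so that the let column is a factor regardless of the R version.

```r
# Same data as in the question; stringsAsFactors is set explicitly so the
# result does not depend on the R version's default.
A <- data.frame(let = c("A", "B", "C"), x = 1:3, y = 4:6,
                stringsAsFactors = TRUE)

# vapply() works like sapply() here, but it guarantees a logical vector
# with one element per column.
num_cols <- vapply(A, is.numeric, logical(1))

# Keep only the numeric columns; drop = FALSE keeps the result a data.frame
# even if a single column is selected.
A_num <- A[, num_cols, drop = FALSE]

# Every remaining column is numeric, although A_num itself is still a data.frame.
all(vapply(A_num, is.numeric, logical(1)))
```

Note that is.numeric(A_num) would still be FALSE, for the same reason as in the question.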
A table with only numeric data can also be understood as a matrix. You can convert the numeric columns of your data frame to a matrix as follows:
M <- as.matrix(A[, c("x", "y")])
M
## x y
## [1,] 1 4
## [2,] 2 5
## [3,] 3 6
The matrix M is now really numeric:
is.numeric(M)
## [1] TRUE
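One caveat, shown as a small sketch (M_all and M_num are just illustrative names): the conversion only gives a numeric matrix if you select the numeric columns first. Converting the whole data frame, including the let column, produces a character matrix.

```r
A <- data.frame(let = c("A", "B", "C"), x = 1:3, y = 4:6)

# Converting all columns: the non-numeric column forces a character matrix.
M_all <- as.matrix(A)
typeof(M_all)      # "character"
is.numeric(M_all)  # FALSE

# Converting only the numeric columns keeps the integer type.
M_num <- as.matrix(A[, c("x", "y")])
typeof(M_num)      # "integer"
is.numeric(M_num)  # TRUE
```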
Upvotes: 3
Reputation: 886968
We need to apply the function on the vector and not on the data.frame:
sapply(A[c("x", "y")], is.numeric)
instead of
is.numeric(A)
since, according to ?is.numeric:
Methods for is.numeric should only return true if the base type of the class is double or integer and values can reasonably be regarded as numeric (e.g., arithmetic on them makes sense, and comparison should be done via the base type).
The class of 'A' is data.frame and is not numeric:
class(A)
#[1] "data.frame"
sapply(A, class)
is.numeric returns TRUE only if the base type of the object is double or integer (class numeric or integer for a plain vector) and the object is not a factor.
Thus, a data.frame can never be numeric; we only get TRUE when we apply is.numeric to a vector or an extracted column. That is the reason we do it in a loop with lapply/sapply, where we get each column as a vector and its class is the class of that column.
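A quick way to see this, as a rough sketch using the data from the question: the base type of a data.frame is list, while the base type of each extracted column is what is.numeric actually looks at.

```r
A <- data.frame(let = c("A", "B", "C"), x = 1:3, y = 4:6)

# A data.frame is a list of columns underneath, so it is never numeric.
typeof(A)        # "list"
is.list(A)       # TRUE
is.numeric(A)    # FALSE

# An extracted column is an atomic vector with its own base type.
typeof(A$x)      # "integer"
is.numeric(A$x)  # TRUE

# A factor is stored as integer codes, but is.numeric() still rejects it.
f <- factor(c("a", "b"))
typeof(f)        # "integer"
is.numeric(f)    # FALSE
```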
Upvotes: 3