Reputation: 44708
I have a data.frame, d1, that has 7 columns; the 5th through 7th columns are supposed to be numeric:
str(d1[5])
'data.frame': 871 obs. of 1 variable:
$ Latest.Assets..Mns.: num 14008 1483 11524 1081 2742 ...
is.numeric(d1[5])
[1] FALSE
as.numeric(d1[5])
Error: (list) object cannot be coerced to type 'double'
How can this be? If str identifies it as numeric, how can it not be numeric? I'm importing from CSV.
Upvotes: 3
Views: 4843
Reputation: 42942
d1 is a list, hence d1[5] is a list of length 1: in this case, a data.frame with a single column. To get the column itself (the numeric vector), use d1[[5]].
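The difference between single and double brackets can be checked on a small stand-in. This is only a sketch: `d` is a hypothetical two-column data frame, with the column name copied from the question and made-up values, not the asker's actual d1.

```r
# Hypothetical stand-in for d1 (values are made up for illustration).
d <- data.frame(id = 1:3, Latest.Assets..Mns. = c(14008, 1483, 11524))

class(d[2])         # single brackets keep the data.frame wrapper
class(d[[2]])       # double brackets extract the column vector
is.numeric(d[2])    # FALSE: a data frame is a list, not a numeric vector
is.numeric(d[[2]])  # TRUE: the column itself is numeric
as.numeric(d[[2]])  # no error, the input is already an atomic vector
```

The same pattern should apply to d1[5] versus d1[[5]], assuming the fifth column really was read in as numeric.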
Even if a data frame contains numeric data, it isn't numeric itself:
> x = data.frame(1:5,6:10)
> is.numeric(x)
[1] FALSE
Individual columns in a data frame are either numeric or not numeric. For instance:
> z <- data.frame(1:5,letters[1:5])
> is.numeric(z[[1]])
[1] TRUE
> is.numeric(z[[2]])
[1] FALSE
If you want to know if ALL columns in a data frame are numeric, you can use all and sapply:
> sapply(z,is.numeric)
X1.5 letters.1.5.
TRUE FALSE
> all(sapply(z,is.numeric))
[1] FALSE
> all(sapply(x,is.numeric))
[1] TRUE
You can wrap this all up in a convenient function:
> is.numeric_data.frame=function(x)all(sapply(x,is.numeric))
> is.numeric_data.frame(d1[[5]])
[1] TRUE
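A slightly stricter variant of the same idea, as a sketch: `all_numeric_columns` is a hypothetical name, and it uses vapply instead of sapply so that each column must yield exactly one logical value. Unlike the call above, it deliberately refuses anything that is not a data frame, so it would be called on d1[5] rather than d1[[5]].

```r
# Hypothetical helper: TRUE only if every column of df is numeric.
all_numeric_columns <- function(df) {
  stopifnot(is.data.frame(df))             # reject bare vectors and other lists
  all(vapply(df, is.numeric, logical(1)))  # one TRUE/FALSE per column
}

z <- data.frame(a = 1:5, b = letters[1:5])
all_numeric_columns(z)       # FALSE: column b is not numeric
all_numeric_columns(z["a"])  # TRUE: z["a"] is a one-column data frame of integers
```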
Upvotes: 4
Reputation: 100204
It may be a list (based on the error message). Have you tried class(d1[5])? If it's a list, then you would expect either d1[[5]] or d1[5][[1]] to be numeric.
Edit:
Given that d1[5] is itself a data frame, you need to treat it as such. Something like this should work:
is.numeric(d1[5][,1])
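The different ways of reaching the column appear to be interchangeable. A small sketch on a hypothetical data frame `d` (the name `assets` and the values are invented for illustration):

```r
# Hypothetical data frame; only the second column matters here.
d <- data.frame(id = 1:3, assets = c(14008, 1483, 11524))

col <- d[[2]]                 # extract the column directly

identical(col, d[2][[1]])     # subset to a one-column data frame, then extract
identical(col, d[2][, 1])     # matrix-style indexing drops to a vector
identical(col, d$assets)      # extraction by name
is.numeric(col)               # TRUE for this example
```

Each comparison should return TRUE, since every form ends up at the same underlying vector.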
Upvotes: 2
Reputation: 60756
d1[5] is not a single value. It's a vector (possibly a list?) of values. If you grab a single value I bet it is numeric. For example:
is.numeric(d1[5][[1]])
as.numeric(d1[5][[1]])
So I think the confusion is between the column object and the elements in the column. R makes a distinction between those two ideas while other languages, like SQL, functionally assume that when discussing the column you're usually referring to the elements of the column.
This discussion of indexing from the R Language Definition doc really helped me wrap my head around how to reference items in R.
Upvotes: 2