Reputation: 821
I've dealt with R's correlation algorithm before, but I am unsure what is going on with my current code.
My input data are two .csv files. The first has only one column, and I coerced it to a data.frame. It looks like this (my data are quite long time series, so I'm only showing the first 10 data points):
trends
V1 0.2701541
V2 2.00532
V3 1.79548
V4 0.2549123
V5 0.2124736
V6 -1.132594
V7 -0.711875
V8 -1.577067
V9 -0.5320426
V10 1.325005
My other file has several columns, and looks as follows:
X13_EVI X14_EVI X15_EVI X18_EVI
1 1.0492437 0.54155557 -0.58480284 -3.47111922
2 1.7274555 1.46141010 0.79416226 1.04050086
3 1.7274555 1.46141010 0.48772557 1.17721662
4 -0.1941446 -0.14833532 -0.12514781 0.22020630
5 -0.1941446 -0.14833532 -0.12514781 0.22020630
6 -0.5332505 -0.60826258 -0.73802119 -0.73680402
7 -0.4202152 -0.49328077 -0.12514781 -0.32665674
8 -0.9853917 -1.29815348 -1.04445787 -0.73680402
9 -0.3071799 -0.03335350 0.18128888 -0.46337250
10 0.5971025 1.00148284 1.10059895 0.63035358
When I try to do
corr=cor(trends, all.obs)
I get the error message
Error in cor(trends, all.obs) : 'x' must be numeric
I can't remember coming across this problem before and am unable to figure out what causes it. In the past I've always been able to calculate the correlation between each observed time series (the columns in all.obs) and the trend (in this case 1 trend). I've checked
> is.numeric(trends)
[1] FALSE
> is.numeric(all.obs)
[1] FALSE
> is.data.frame(all.obs)
[1] TRUE
> is.data.frame(trends)
[1] TRUE
I also did
> typeof(all.obs)
[1] "list"
> typeof(trends)
[1] "list"
because I got
> trends=as.numeric(trends)
Error: (list) object cannot be coerced to type 'double'
It's been a while since I worked with this though, so maybe I'm missing something very obvious?
Upvotes: 0
Views: 2589
Reputation: 660
Check whether all the columns of trends and all.obs are stored as numeric. To do so, run sapply(trends, is.numeric) and sapply(all.obs, is.numeric). If you see any FALSE in the output, fix it by coercing that column to numeric with the as.numeric() function.
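A minimal sketch of that check and fix, using a small made-up data frame in place of the real file (the values and the stray "n/a" entry are assumptions for illustration, not taken from the question's data):

```r
# Hypothetical stand-in for all.obs: one column was read as character
# because of a stray non-numeric entry.
all.obs <- data.frame(
  X13_EVI = c(1.05, 1.73, -0.19, -0.53),
  X14_EVI = c("0.54", "1.46", "n/a", "-0.61"),
  stringsAsFactors = FALSE
)

# Which columns are already numeric?
col_is_numeric <- sapply(all.obs, is.numeric)
print(col_is_numeric)

# Coerce the offending columns. Going through as.character() first means
# factor columns are converted by label rather than by internal level code.
# Entries that cannot be parsed become NA (the warning is suppressed here).
all.obs[!col_is_numeric] <- lapply(
  all.obs[!col_is_numeric],
  function(col) suppressWarnings(as.numeric(as.character(col)))
)

print(sapply(all.obs, is.numeric))
```

If the coercion introduces NA values, cor() will propagate them unless you pass something like use = "complete.obs".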
Alternatively, a better way to avoid this kind of problem is to specify the column types when reading the csv files, using the colClasses parameter of the read.csv function. Example:
trends <- read.csv("PATH_TO_DATA_FOLDER/trends.csv", colClasses = "numeric")
all.obs <- read.csv("PATH_TO_DATA_FOLDER/all_obs.csv", colClasses = rep("numeric", 4))
See if it is sufficient.
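For a self-contained illustration of the colClasses approach, here is a rough sketch that reads inline text instead of files (the text = argument, the column name "trend", and the shortened values are stand-ins I made up so the snippet runs without the original csv files):

```r
# Inline stand-ins for the two csv files; values are illustrative only.
trends <- read.csv(
  text = "trend\n0.27\n2.01\n1.80\n0.25\n0.21",
  colClasses = "numeric"
)
all.obs <- read.csv(
  text = "X13_EVI,X14_EVI\n1.05,0.54\n1.73,1.46\n1.73,1.46\n-0.19,-0.15\n-0.53,-0.61",
  colClasses = rep("numeric", 2)
)

# Every column is numeric, so cor() accepts the data frames directly.
print(sapply(trends, is.numeric))
print(sapply(all.obs, is.numeric))

# One row (the trend) by one column per observed series.
corr <- cor(trends, all.obs)
print(dim(corr))
```

With colClasses set, a non-numeric entry in the file should make read.csv fail at read time, which tends to be easier to trace than an error later in cor().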
Upvotes: 3