Reputation: 4949
Hi, it seems that Spearman correlation should produce the same result regardless of whether it is computed on z-scores or on the raw values. See, for example:
https://stats.stackexchange.com/questions/13952/can-spearmans-correlation-be-run-on-z-scores
However, for the example below the two correlations are different, and I'm wondering what is going on.
df <- read.csv("https://www.dropbox.com/s/jdktw9jugzm97v3/test.csv?dl=1", header = FALSE)
cor(df[, 1], df[, 2], method = "spearman")
cor(scale(df[, 1]), scale(df[, 2]), method = "spearman")
# 0.8462699 vs 0.8905341
Interestingly, Pearson gives the same result for both versions. What am I doing or thinking incorrectly here?
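For reference, here is a minimal sketch of the invariance I expected, using small made-up vectors (`x` and `y` are hypothetical, not the data from the CSV). Since Spearman is Pearson computed on the ranks, and `scale()` only subtracts a constant and divides by a positive constant, the ranks and hence the correlation should not change when the values are cleanly separated:

```r
# Hypothetical data with exact ties (not the values from test.csv).
x <- c(3, 1, 4, 1, 5, 9, 2, 6)
y <- c(2, 7, 1, 8, 2, 8, 1, 8)

# scale() returns a one-column matrix, so convert back to plain vectors.
xs <- as.vector(scale(x))
ys <- as.vector(scale(y))

# Centering and dividing by a positive constant is monotone,
# so the ranks (including the tied ones) should be unchanged.
same_ranks <- identical(rank(x), rank(xs)) && identical(rank(y), rank(ys))

rho_raw    <- cor(x, y, method = "spearman")
rho_scaled <- cor(xs, ys, method = "spearman")

print(same_ranks)
print(all.equal(rho_raw, rho_scaled))
```

With well-separated values like these, both checks should come out `TRUE`, which is why the mismatch on my data surprised me.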
Edit: I thought this might be due to ties, so I also tried Kendall, which should handle ties. However, it also gives different results.
cor(as.matrix(df[, 1]), as.matrix(df[, 2]), method = "kendall")
cor(scale(as.matrix(df[, 1])), scale(as.matrix(df[, 2])), method = "kendall")
Thanks.
Upvotes: 1
Views: 100
Reputation: 4949
Hi, as mentioned in the comments above, this was due to a rounding error. No one posted an answer, but I wanted to add this in case someone else stumbles on a similar issue. When I round the data to 15-16 digits, the two results are the same.
df <- read.csv("https://www.dropbox.com/s/jdktw9jugzm97v3/test.csv?dl=1", header = FALSE)
df <- round(df, digits = 15)
cor(as.matrix(df[, 1]), as.matrix(df[, 2]), method = "spearman")
cor(scale(df[, 1]), scale(df[, 2]), method = "spearman")
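To illustrate the mechanism, here is a minimal sketch with made-up values (not the CSV data; `digits = 10` is an arbitrary choice for this toy example). Two numbers that are equal on paper can differ in the last bit, so `rank()` orders them instead of treating them as a tie, and rounding restores the tie. My assumption is that `scale()` created or removed near-ties of this kind in my data:

```r
# Hypothetical data: the first two values are "equal" on paper,
# but 0.1 + 0.2 is not exactly 0.3 in binary floating point.
x <- c(0.1 + 0.2, 0.3, 1, 2)
y <- c(1, 2, 3, 4)

print(x[1] == x[2])   # FALSE: the two values differ in the last bit

# Without rounding, the near-tie is ordered like any other pair.
r_raw <- rank(x)                  # 2 1 3 4

# After rounding, both values collapse to the same number and share a rank.
r_rounded <- rank(round(x, 10))   # 1.5 1.5 3.0 4.0

# Different ranks mean different rank correlations.
rho_raw     <- cor(x, y, method = "spearman")
rho_rounded <- cor(round(x, 10), y, method = "spearman")

print(r_raw)
print(r_rounded)
print(c(rho_raw, rho_rounded))
```

On real data, something like `which(rank(df[, 1]) != rank(as.vector(scale(df[, 1]))))` should list the positions whose rank changes after scaling, which is a quick way to check whether near-ties are the cause.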
Thanks, everyone, for helping with this.
Upvotes: 1