Reputation: 309
I'm a beginner in R. I have learned how to check the correlation between numeric variables.
However, I cannot find details on how to check the correlation between a numeric and a boolean variable. Can anybody give me tips or guide me on this?
Thanks in advance!
Upvotes: 3
Views: 2642
Reputation: 81693
I suppose you are looking for the point-biserial correlation. Install the package ltm. It includes the function biserial.cor.
x <- rnorm(10)
y <- rep(c(0,1), 5)
library(ltm)
biserial.cor(x,y)
#[1] -0.08279833
See ?biserial.cor for details.
The result is slightly different from the one obtained with the built-in cor function:
cor(x,y)
#[1] 0.0872771
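The point-biserial coefficient is algebraically the same quantity as Pearson's r computed on a 0/1 coding, so the small discrepancy above deserves a closer look. The sketch below computes it by hand in base R. The seed and variable names are my own choices, not from the answer. The two numbers quoted above differ by a sign and by a factor of about sqrt(9/10), which would be consistent with biserial.cor treating the first level as the reference group and dividing by the sample standard deviation; that reading is an assumption based on the numbers, not something checked against the ltm source.

```r
# Sketch: point-biserial correlation computed by hand in base R.
set.seed(42)                  # arbitrary seed, only for reproducibility
x <- rnorm(10)                # numeric variable
y <- rep(c(0, 1), 5)          # dichotomous variable coded 0/1

n  <- length(x)
p  <- mean(y == 1)            # proportion of observations with y == 1
m1 <- mean(x[y == 1])         # mean of x in group 1
m0 <- mean(x[y == 0])         # mean of x in group 0

sd_pop <- sqrt(mean((x - mean(x))^2))   # standard deviation with divisor n
r_pb   <- (m1 - m0) / sd_pop * sqrt(p * (1 - p))

# Same quantity as Pearson's r on the 0/1 coding
isTRUE(all.equal(r_pb, cor(x, y)))

# Variant with the sample SD (divisor n - 1) and the two groups swapped
r_alt <- (m0 - m1) / sd(x) * sqrt(p * (1 - p))
isTRUE(all.equal(r_alt, -cor(x, y) * sqrt((n - 1) / n)))
```

Both comparisons should evaluate to TRUE: the first shows that the hand-computed coefficient matches cor, the second shows how a sign flip and the sqrt((n - 1) / n) factor arise from a different reference group and a different SD divisor.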
Upvotes: 3
Reputation: 60462
This answers your question:
##x is logical, i.e. TRUE or FALSE
R> x = sample(c(TRUE, FALSE), 10, replace=TRUE)
##y is numeric
R> y = runif(10)
##When we use correlation
##R converts TRUE to 1 and FALSE to 0.
R> cor(x, y)
[1] -0.5514
The obvious question is: should you be doing this? Remember, correlation measures the strength of a linear relationship between x and y, i.e. as x increases, y changes in a linear manner. That picture doesn't really fit your scenario, because x can only take two values. As the answer by @Sven indicates, you want to use the point-biserial correlation method.
If your data is a character vector, say:
x = c("M", "F")
then you would need to do an additional step:
x[x=="M"] = 1
x[x=="F"] = 0
x = as.numeric(x)  # the vector is still character after the replacements
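A more compact way to get a numeric 0/1 coding from a character vector is to compare against one of the labels and convert the resulting logical vector. This is only a sketch: the example vector is made up, and it assumes "M" should map to 1 and every other value to 0.

```r
# Hypothetical example data: a character vector with two labels
x <- c("M", "F", "F", "M", "M")

# x == "M" gives a logical vector; as.integer() maps TRUE to 1 and FALSE to 0
x_num <- as.integer(x == "M")
x_num
```

The original vector is left untouched, so the labels remain available for later checks.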
Upvotes: 2