Reputation: 4226
I have the data.frame df
with three variables with values of "1" or "0" and no rows with more than one of the variables with a "1":
> df <- structure(list(var1 = c(0, 0, 0, 0, 1, 0, 0, 1, 0, 0), var2 = c(1,
0, 0, 0, 0, 0, 0, 0, 0, 0), var3 = c(0, 1, 0, 1, 0, 1, 0, 0,
0, 1)), .Names = c("var1", "var2", "var3"), row.names = c(NA,
-10L), class = c("tbl_df", "tbl", "data.frame"))
> df
var1 var2 var3
1 0 1 0
2 0 0 1
3 0 0 0
4 0 0 1
5 1 0 0
6 0 0 1
7 0 0 0
8 1 0 0
9 0 0 0
10 0 0 1
The row sums are at most 1 for all of the rows:
> rowSums(df)
[1] 1 1 0 1 1 1 0 1 0 1
When I look at the correlations (I used the "spearman" argument because the data are "1"s and "0"s), the output is confusing because there are correlations that are non-zero:
> cor(df, method = "spearman")
var1 var2 var3
var1 1.0000000 -0.1666667 -0.4082483
var2 -0.1666667 1.0000000 -0.2721655
var3 -0.4082483 -0.2721655 1.0000000
I wondered if this was some strange side-effect of stats::cor(), so I tried Hmisc::rcorr() with the same result:
> Hmisc::rcorr(as.matrix(df), type = "spearman")
var1 var2 var3
var1 1.00 -0.17 -0.41
var2 -0.17 1.00 -0.27
var3 -0.41 -0.27 1.00
Shouldn't the correlations between all three variables be 0 because there are no rows in which more than one variable has a value of "1"? Am I misunderstanding how correlations work in some profound way? Or am I using these functions incorrectly?
Upvotes: 0
Views: 110
Reputation: 2922
Your observation that the row sums are never greater than 1 actually implies that the variables are negatively correlated: negative correlation means that when one variable is larger (in your case 1), the other tends to be smaller (in your case 0), which agrees with your results.
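One way to see this in your own data is to compare the average of one variable across the two levels of another. The snippet below is only a sketch using var1 and var3 from your question; the names m_when_1 and m_when_0 are made up for illustration.

```r
# var1 and var3 copied from the data in the question
var1 <- c(0, 0, 0, 0, 1, 0, 0, 1, 0, 0)
var3 <- c(0, 1, 0, 1, 0, 1, 0, 0, 0, 1)

# Average of var3 in the rows where var1 is 1, and where var1 is 0
m_when_1 <- mean(var3[var1 == 1])  # 0: var3 is never 1 when var1 is 1
m_when_0 <- mean(var3[var1 == 0])  # 0.5: var3 is 1 in 4 of the 8 remaining rows

# var3 is lower on average when var1 is higher, hence a negative correlation
m_when_1 < m_when_0
```

If the two variables were uncorrelated, the two averages would be about the same.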
Your confusion might arise because the inner product of any two of the variables is zero. But a zero inner product does not mean there is no correlation. It implies zero linear correlation only when each variable has been centered to have mean zero, which is certainly not the case here.
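To make the difference concrete, here is a small sketch with var1 and var2 from your question. It compares the raw inner product with the inner product after subtracting the means; xc, yc, raw_ip, centered_ip and r_manual are illustrative names, not anything from a package. For 0/1 data the ranks are just a linear rescaling of the values, so Spearman and Pearson should give the same number.

```r
# var1 and var2 copied from the data in the question
x <- c(0, 0, 0, 0, 1, 0, 0, 1, 0, 0)
y <- c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0)

# Raw inner product: zero, because no row has a 1 in both columns
raw_ip <- sum(x * y)

# Center each variable at its mean, then take the inner product again
xc <- x - mean(x)
yc <- y - mean(y)
centered_ip <- sum(xc * yc)  # negative, about -0.2

# Correlation is the centered inner product scaled by the centered lengths
r_manual <- centered_ip / sqrt(sum(xc^2) * sum(yc^2))

# Both comparisons should be TRUE
all.equal(r_manual, cor(x, y))
all.equal(r_manual, cor(x, y, method = "spearman"))
```

r_manual comes out at about -0.1667, the same value as the var1/var2 entry in your cor() output.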
Upvotes: 1