Reputation: 5
You can tell I'm a beginner, since I can't even reproduce my problem with a dummy dataset... Anyway, here goes: I want to calculate tetrachoric correlations between one grouping variable and several other variables, like this:
library(psych)
set.seed(42)
n <- 16
dat <- data.frame(id=1:n,
                  group=c(rep("a", times=5), rep("b", times=3)),
                  x=sample(1:2, n, replace=TRUE),
                  y=sample(1:2, n, replace=TRUE),
                  z=sample(1:2, n, replace=TRUE))
dat
  id group x y z
1  1     a 1 1 2
2  2     a 1 2 2
3  3     a 1 1 2
4  4     a 1 2 2
5  5     a 2 1 1
6  6     b 2 2 1
7  7     b 2 1 1
8  8     b 2 1 1
tetrachoric(as.matrix(dat[,c("group","y")]))
Now with this example (not with my actual dataset) I get an error which I'm unable to solve:
Error in apply(x, 2, function(x) min(x, na.rm = TRUE)) :
  dim(X) must have a positive length
In addition: Warning messages:
1: In var(if (is.vector(x) || is.factor(x)) x else as.double(x), na.rm = na.rm) :
  NAs introduced by coercion
2: In tetrachoric(as.matrix(dat[, c("group", "y")])) :
  Item = group had no variance and was deleted
My question remains: what would be the best way to get all the correlations with a single piece of code? Thank you for your help!
Upvotes: 0
Views: 1186
Reputation: 21757
The help file for tetrachoric says "The tetrachoric correlation is the inferred Pearson Correlation from a two x two table with the assumption of bivariate normality", so presumably you need to pass it a 2x2 table. You could write a little function that hands tetrachoric the appropriate table and collects the results:
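# x: grouping vector; y: data frame of dichotomous variables;
# ...: further arguments passed on to tetrachoric()
myfun <- function(x, y, ...) {
  # One 2x2 table of x against each column of y
  tabs <- lapply(seq_along(y), function(i) table(x, y[, i]))
  # Tetrachoric correlation for each table
  l <- lapply(tabs, function(tab) tetrachoric(tab, ...))
  # Collect the correlations (rho) and cutpoints (tau)
  rho <- sapply(l, function(res) res$rho)
  tau <- sapply(l, function(res) res$tau)
  colnames(tau) <- colnames(y)
  names(rho) <- colnames(y)
  list(rho = rho, tau = tau)
}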
myfun(dat$group, dat[,c("x", "y", "z")])
# $rho
# x y z
# 0.5397901 -0.2605839 0.6200705
#
# $tau
# x y z
# a 0.3186394 0.3186394 0.2690661
# 1 0.1573107 0.1573107 -0.6045853
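As for the error in the question itself: as.matrix() on a data frame that mixes a character column with numeric ones returns a character matrix, which would explain the "NAs introduced by coercion" warning. Below is a minimal sketch of a possible alternative, not the approach used above: recode the grouping variable as 0/1 and hand tetrachoric the whole numeric matrix at once. The data values are copied from the 8 rows printed in the question, the names m_chr, num and m_num are made up for illustration, and the tetrachoric call assumes psych is installed, so it is guarded.

```r
# Rebuild the 8 rows printed in the question (values copied from that output)
dat <- data.frame(
  group = c(rep("a", 5), rep("b", 3)),
  x = c(1, 1, 1, 1, 2, 2, 2, 2),
  y = c(1, 2, 1, 2, 1, 2, 1, 1),
  z = c(2, 2, 2, 2, 1, 1, 1, 1)
)

# Mixing a character column with a numeric one gives a character matrix,
# so the numeric coercion inside tetrachoric() produces NAs for "group"
m_chr <- as.matrix(dat[, c("group", "y")])
is.character(m_chr)

# Recode the grouping variable as 0/1 so that every column is numeric
num <- dat[, c("x", "y", "z")]
num$group <- as.integer(dat$group == "b")
m_num <- as.matrix(num[, c("group", "x", "y", "z")])
is.numeric(m_num)

# Assumption: psych is installed; skipped otherwise
if (requireNamespace("psych", quietly = TRUE)) {
  res <- psych::tetrachoric(m_num)
  # The "group" row holds its correlation with each of the other variables
  print(res$rho["group", c("x", "y", "z")])
}
```

With only 8 rows some of the 2x2 tables have empty cells, so expect warnings and treat the estimates as rough at best.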
Upvotes: 0