jik84

Reputation: 5

R: Tetrachoric correlation for multiple variables at one go?

You can see I'm a beginner at this when I'm not even able to reproduce my problem with a dummy dataset... Anyways, here goes: I want to calculate tetrachoric correlations between one grouping variable and multiple other variables. Like this:

library(psych)

set.seed(42)
n <- 8
dat <- data.frame(id=1:n,
                  group=c(rep("a", times=5), rep("b", times=3)),
                  x=sample(1:2, n, replace=TRUE),
                  y=sample(1:2, n, replace=TRUE),
                  z=sample(1:2, n, replace=TRUE))

dat

  id group x y z
1  1     a 1 1 2
2  2     a 1 2 2
3  3     a 1 1 2
4  4     a 1 2 2
5  5     a 2 1 1
6  6     b 2 2 1
7  7     b 2 1 1
8  8     b 2 1 1

tetrachoric(as.matrix(dat[,c("group","y")]))

Now with this example (not with my actual dataset) I get an error which I'm unable to solve:

Error in apply(x, 2, function(x) min(x, na.rm = TRUE)) :
  dim(X) must have a positive length
In addition: Warning messages:
1: In var(if (is.vector(x) || is.factor(x)) x else as.double(x), na.rm = na.rm) :
  NAs introduced by coercion
2: In tetrachoric(as.matrix(dat[, c("group", "y")])) :
  Item = group had no variance and was deleted
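For what it's worth, the first warning ("NAs introduced by coercion") points at a likely cause: as.matrix() on a data frame that mixes a character column with numeric columns returns a character matrix, so no numeric data reaches tetrachoric(). A minimal base-R sketch of that coercion, using a small hypothetical data frame `toy` (not the data above):

```r
# Hypothetical toy data, for illustration only
toy <- data.frame(group = c("a", "a", "b", "b"),
                  y     = c(1, 2, 1, 2))

# Mixing a character column with a numeric one yields a character matrix
m_chr <- as.matrix(toy[, c("group", "y")])
is.character(m_chr)   # TRUE: the numeric column was coerced to text

# Recoding the grouping variable to 0/1 first keeps the matrix numeric
toy$group01 <- as.integer(toy$group == "b")
m_num <- as.matrix(toy[, c("group01", "y")])
is.numeric(m_num)     # TRUE
```

Which level is coded as 1 is an arbitrary choice here; flipping it should only flip the sign of any correlation involving the grouping variable.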

My question remains: what would be the best way to get all the correlations with a single piece of code? Thanks for your help!

Upvotes: 0

Views: 1186

Answers (1)

DaveArmstrong

Reputation: 21757

The help file for tetrachoric says "The tetrachoric correlation is the inferred Pearson Correlation from a two x two table with the assumption of bivariate normality", so presumably you need to pass it a 2x2 table. You could write a little function that hands tetrachoric() the appropriate table for each variable and collects the results:

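myfun <- function(x, y, ...) {
  # x: grouping vector; y: data frame of binary variables
  # Cross-tabulate x against each column of y (one 2x2 table per column)
  tabs <- lapply(seq_along(y), function(i) table(x, y[, i]))
  # Run tetrachoric() on each table; extra arguments are passed through
  fits <- lapply(tabs, function(tab) tetrachoric(tab, ...))
  # Collect the correlations and the thresholds
  rho <- sapply(fits, function(fit) fit$rho)
  tau <- sapply(fits, function(fit) fit$tau)
  names(rho) <- colnames(y)
  colnames(tau) <- colnames(y)
  list(rho = rho, tau = tau)
}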

myfun(dat$group, dat[,c("x", "y", "z")])
# $rho
#         x          y          z 
# 0.5397901 -0.2605839  0.6200705 
# 
# $tau
#           x         y          z
# a 0.3186394 0.3186394  0.2690661
# 1 0.1573107 0.1573107 -0.6045853
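As a possible alternative, tetrachoric() also accepts a data frame or matrix of dichotomous numeric data, so recoding the grouping variable to 0/1 and passing all columns at once should return the full correlation matrix in one call. The sketch below uses hypothetical simulated data (`toy`, not the data from the question) and assumes the psych package is installed:

```r
library(psych)

# Hypothetical toy data: a two-level group and three binary items
set.seed(1)
toy <- data.frame(group = rep(c("a", "b"), each = 50),
                  x = rbinom(100, 1, 0.5),
                  y = rbinom(100, 1, 0.5),
                  z = rbinom(100, 1, 0.5))

# Recode the character grouping variable to 0/1 so every column is numeric
toy$group <- as.integer(toy$group == "b")

# One call on the numeric data frame gives all pairwise correlations
tc <- tetrachoric(toy[, c("group", "x", "y", "z")])

# Keep only the correlations between group and the other variables
tc$rho["group", c("x", "y", "z")]
```

This also computes the correlations among x, y and z, which are simply ignored here. With very small samples or empty cells, tetrachoric() may apply a continuity correction or emit warnings, so the estimates should be read with care.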

Upvotes: 0
