Reputation: 4273
Say I have a three-dimensional array, with items as rows, items as columns, participants as the third dimension, and co-occurrence counts as values. Notice further that each of the array "slices" (= item x item matrices) is symmetric (because they're co-occurrence counts!).
Like so:
a <- structure(c(17L, 1L, 0L, 1L, 1L, 17L, 0L, 1L, 0L, 0L, 17L, 0L, 1L, 1L, 0L, 17L, 16L, 0L, 0L, 1L, 0L, 16L, 0L, 0L, 0L, 0L, 16L, 0L, 1L, 0L, 0L, 16L, 18L, 1L, 2L, 3L, 1L, 18L, 1L, 2L, 2L, 1L, 18L, 0L, 3L, 2L, 0L, 18L), .Dim = c(4L, 4L, 3L), .Dimnames = structure(list(items = c("but-how", "encyclopedia", "alien", "comma"), items = c("but-how", "encyclopedia", "alien", "comma"), people = c("Julius", "Tashina", "Azra")), .Names = c("items", "items", "people")))
I now want the participants x participants matrix of correlation coefficients, that is, the respective coefficients for Julius, Tashina and Azra.
To do that, I'd just want to correlate the corresponding cells of two participants' matrices, so for Azra and Tashina, I'd correlate their respective upper (or lower) triangles.
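To make the pairwise idea concrete, here is a minimal sketch for just Tashina and Azra. The two matrices are retyped from the array above so the snippet runs on its own; the names `tashina`, `azra`, `keep` and `r_pair` are made up for illustration.

```r
items <- c("but-how", "encyclopedia", "alien", "comma")

# The Tashina and Azra slices, copied from the array in the question
tashina <- matrix(c(16, 0, 0, 1,  0, 16, 0, 0,  0, 0, 16, 0,  1, 0, 0, 16),
                  nrow = 4, dimnames = list(items, items))
azra    <- matrix(c(18, 1, 2, 3,  1, 18, 1, 2,  2, 1, 18, 0,  3, 2, 0, 18),
                  nrow = 4, dimnames = list(items, items))

# Logical mask for the cells strictly above the diagonal
keep <- upper.tri(tashina, diag = FALSE)

# Correlate the six off-diagonal cells of one person with the other's
r_pair <- cor(tashina[keep], azra[keep])
r_pair  # roughly 0.70
```

Because the slices are symmetric, the lower triangle holds the same values as the upper one, so either triangle gives the same coefficient.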
It's not obvious to me how to do this, since cor() and friends don't accept arrays.
I can hack this together with some apply() and upper.tri() action, as shown below, but I am guessing there has to be a more efficient, matrix-magical way to do this, right?
Here's the hacky way I'm doing this now. Don't laugh.
loosedat <- apply(X = a, MARGIN = c(3), FUN = function(x) {
  # keep only the upper triangle; the diagonal must go, it would otherwise inflate the correlations
  x[upper.tri(x = x, diag = FALSE)]
})
cor(loosedat)
Gets me what I want, but I feel dirty doing it.
Julius Tashina Azra
Julius 1.0000000 0.4472136 0.522233
Tashina 0.4472136 1.0000000 0.700649
Azra 0.5222330 0.7006490 1.000000
Upvotes: 0
Views: 107
Reputation: 73275
How about
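n <- dim(a)[3L] ## number of people
m <- dim(a)[1L] ## square table dimension
id <- dimnames(a)[[3L]] ## names of people
uptri <- upper.tri(diag(m)) ## logical index of the upper triangle (m x m, diagonal excluded)
## the m x m logical index is recycled over the flattened array, so it picks the upper triangle of every slice
loosedat <- matrix(as.numeric(a)[uptri], ncol = n, dimnames = list(NULL, id))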
# Julius Tashina Azra
#[1,] 1 0 1
#[2,] 0 0 2
#[3,] 0 0 1
#[4,] 1 1 3
#[5,] 1 0 2
#[6,] 0 0 0
cor(loosedat)
# Julius Tashina Azra
#Julius 1.0000000 0.4472136 0.522233
#Tashina 0.4472136 1.0000000 0.700649
#Azra 0.5222330 0.7006490 1.000000
You can squeeze the above code into a single line, but for a readable demonstration I have taken the step-by-step approach.
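For reference, the single-expression version might look like the sketch below. The wrapper name `cor_slices` is hypothetical (it is not part of base R), and the array is rebuilt with array() from the values in the question so the snippet runs on its own.

```r
# The question's array, rebuilt so this block is self-contained
items  <- c("but-how", "encyclopedia", "alien", "comma")
people <- c("Julius", "Tashina", "Azra")
a <- array(
  c(17L, 1L, 0L, 1L,  1L, 17L, 0L, 1L,  0L, 0L, 17L, 0L,  1L, 1L, 0L, 17L,
    16L, 0L, 0L, 1L,  0L, 16L, 0L, 0L,  0L, 0L, 16L, 0L,  1L, 0L, 0L, 16L,
    18L, 1L, 2L, 3L,  1L, 18L, 1L, 2L,  2L, 1L, 18L, 0L,  3L, 2L, 0L, 18L),
  dim = c(4L, 4L, 3L),
  dimnames = list(items = items, items = items, people = people)
)

# Hypothetical helper: correlate the upper triangles of all slices of a
# square-sliced 3-d array, using the same recycling trick as above
cor_slices <- function(arr) {
  stopifnot(length(dim(arr)) == 3L, dim(arr)[1L] == dim(arr)[2L])
  cor(matrix(as.numeric(arr)[upper.tri(diag(dim(arr)[1L]))],
             ncol = dim(arr)[3L],
             dimnames = list(NULL, dimnames(arr)[[3L]])))
}

res <- cor_slices(a)
res
```

This should reproduce the same people x people matrix as the step-by-step code. It relies on every slice being square and of the same size, which is why the helper checks the dimensions first.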
Upvotes: 1