brucezepplin

Reputation: 9752

calculate frequency or percentage matrix in R

If I have the following:

mm <- matrix(0, 4, 3)
mm<-apply(mm, c(1, 2), function(x) sample(c(0, 1), 1))

> mm
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    1    1    0
[3,]    0    0    0
[4,]    1    0    1

How do I output a matrix that gives, for each pair of different columns, the frequency or percentage of rows in which both values equal 1? For example, there are two rows out of 4 where column 1 and column 2 both equal 1 (= 0.5), and 1 row out of 4 where column 2 and column 3 both equal 1 (= 0.25), so in this case I'd need:

     [,1]      [,2]      [,3]
[1,]    1      0.5       0.5
[2,]    0.5    1         0.25
[3,]    0.5    0.25      1

I am not interested in comparing the same columns, so by default the diagonal remains at 1.

I thought I might get somewhere with cor(mm), in case there was a way to output co-frequencies or co-percentages instead of correlation coefficients, but this appears not to be the case. The final output should still be an N by N matrix (N = number of columns), as cor() outputs:

> cor(mm)
          [,1]      [,2]      [,3]
[1,] 1.0000000 0.5773503 0.5773503
[2,] 0.5773503 1.0000000 0.0000000
[3,] 0.5773503 0.0000000 1.0000000

But obviously these are correlation coefficients; I just want co-frequencies or co-percentages instead.

Upvotes: 2

Views: 971

Answers (2)

ThomasIsCoding

Reputation: 101014

A base R solution is to use crossprod, i.e.,

r <- `diag<-`(crossprod(mm)/nrow(mm),1)

such that

> r
     [,1] [,2] [,3]
[1,]  1.0 0.50 0.50
[2,]  0.5 1.00 0.25
[3,]  0.5 0.25 1.00

DATA

mm <- structure(c(1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1), .Dim = 4:3)
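To see why this works: for a 0/1 matrix, crossprod(mm) is t(mm) %*% mm, so entry (i, j) is sum(mm[, i] * mm[, j]), which counts the rows where columns i and j are both 1. A minimal check on the data above (the names `counts` and `manual` are only illustrative):

```r
mm <- structure(c(1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1), .Dim = 4:3)

# crossprod(mm) is t(mm) %*% mm: entry (i, j) counts rows
# where columns i and j are both 1 (valid because mm is 0/1)
counts <- crossprod(mm)

# The same count done by hand for columns 1 and 2
manual <- sum(mm[, 1] == 1 & mm[, 2] == 1)

# Divide by the number of rows to get proportions,
# then set the diagonal to 1 as the question asks
r <- counts / nrow(mm)
diag(r) <- 1
r
```

Multiplying `r` by 100 before resetting the diagonal would give percentages instead of proportions.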

Upvotes: 4

r.user.05apr

Reputation: 5456

set.seed(123)

mm <- matrix(0, 4, 3)
mm<-apply(mm, c(1, 2), function(x) sample(c(0, 1), 1))

combinations <- expand.grid(1:ncol(mm), 1:ncol(mm))

matrix(unlist(Map(function(x, y) {
  if (x == y) {
    res <- 1
  } else {
    res <- sum(mm[, x] * mm[, y]) / nrow(mm)
  }
  res
}, combinations[, 1], combinations[, 2])), ncol(mm))

#      [,1] [,2] [,3]
# [1,] 1.00 0.25  0.0
# [2,] 0.25 1.00  0.5
# [3,] 0.00 0.50  1.0
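The output above comes from a random draw, so it will not match the matrix in the question. As a rough sketch, the same pairwise logic can be run on the question's fixed data, here with outer() instead of expand.grid() and Map() (`co_freq` and `idx` are hypothetical names, not part of the original answer):

```r
mm <- structure(c(1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1), .Dim = 4:3)

# Hypothetical helper: proportion of rows where columns x and y are both 1;
# the diagonal (x == y) is fixed at 1
co_freq <- function(x, y) {
  if (x == y) 1 else sum(mm[, x] * mm[, y]) / nrow(mm)
}

idx <- seq_len(ncol(mm))

# outer() expects a vectorized function, hence Vectorize()
res <- outer(idx, idx, Vectorize(co_freq))
res
```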

Upvotes: 0
