Christopher Davis

Reputation: 95

pairwise comparison formulas in R

I am new to R and need to apply pairwise comparison formulas across a set of variables. The number of elements to be compared will be dynamic, but here is a hardcoded example with 4 elements, each compared against the others:

# there are 4 choices: A, B, C, D
# they are compared against each other and the comparisons are stored:
df1 <- data.frame("A" = c(80),"B" = c(20))
df2 <- data.frame("A" = c(90),"C" = c(10))
df3 <- data.frame("A" = c(95), "D" = c(5))
df4 <- data.frame("B" = c(80), "C" = c(20))
df5 <- data.frame("B" = c(90), "D" = c(10))
df6 <- data.frame("C" = c(80), "D" = c(20))

#show the different comparisons in a matrix
matrixA <- matrix(c("", df1$B[1], df2$C[1], df3$D[1],
                df1$A[1],     "", df4$C[1], df5$D[1],
                df2$A[1], df4$B[1],     "", df6$D[1],
                df3$A[1], df5$B[1], df6$C[1],    ""),
              nrow=4,ncol = 4,byrow = TRUE)
dimnames(matrixA) = list(c("A","B","C","D"),c("A","B","C","D"))

#perform calculations on the comparisons
matrixB <- matrix(
      c(1,              df1$B[1]/df1$A[1], df2$C[1]/df2$A[1], df3$D[1]/df3$A[1], 
        df1$A[1]/df1$B[1],              1, df4$C[1]/df4$B[1], df5$D[1]/df5$B[1],
        df2$A[1]/df2$C[1], df4$B[1]/df4$C[1],              1, df6$D[1]/df6$C[1],
        df3$A[1]/df3$D[1], df5$B[1]/df5$D[1], df6$C[1]/df6$D[1],         1),
              nrow = 4, ncol = 4, byrow = TRUE)
matrixB <- rbind(matrixB, colSums(matrixB)) # add the sum of the columns
dimnames(matrixB) = list(c("A","B","C","D","Sum"),c("A","B","C","D"))

# do some more calculations that I'll use later on
dfC <- data.frame("AB" = c(matrixB["A","A"] / matrixB["A","B"], 
                        matrixB["B","A"] / matrixB["B","B"],
                        matrixB["C","A"] / matrixB["C","B"],
                        matrixB["D","A"] / matrixB["D","B"]),
              "BC" = c(matrixB["A","B"] / matrixB["A","C"],
                        matrixB["B","B"] / matrixB["B","C"],
                        matrixB["C","B"] / matrixB["C","C"],
                        matrixB["D","B"] / matrixB["D","C"]
                        ), 
              "CD" = c(matrixB["A","C"] / matrixB["A","D"],
                        matrixB["B","C"] / matrixB["B","D"],
                        matrixB["C","C"] / matrixB["C","D"],
                        matrixB["D","C"] / matrixB["D","D"]))

dfCMeans <- colMeans(dfC)

#create the normalization matrix
matrixN <- matrix(c(
  matrixB["A","A"] / matrixB["Sum","A"], matrixB["A","B"] / matrixB["Sum","B"], matrixB["A","C"] / matrixB["Sum","C"], matrixB["A","D"] / matrixB["Sum","D"],
  matrixB["B","A"] / matrixB["Sum","A"], matrixB["B","B"] / matrixB["Sum","B"], matrixB["B","C"] / matrixB["Sum","C"], matrixB["B","D"] / matrixB["Sum","D"],
  matrixB["C","A"] / matrixB["Sum","A"], matrixB["C","B"] / matrixB["Sum","B"], matrixB["C","C"] / matrixB["Sum","C"], matrixB["C","D"] / matrixB["Sum","D"],
  matrixB["D","A"] / matrixB["Sum","A"], matrixB["D","B"] / matrixB["Sum","B"], matrixB["D","C"] / matrixB["Sum","C"], matrixB["D","D"] / matrixB["Sum","D"]
  ), nrow = 4, ncol = 4, byrow = TRUE)

Since R is so concise, it seems like there should be a much better way to do this. I would like to know an easier way to do these types of calculations in R.

Upvotes: 0

Views: 386

Answers (1)

AkselA

Reputation: 8836

OK, I might be starting to piece together something here.

We start with a matrix like so:

A <- structure(
  c(NA, 20, 10, 5, 80, NA, 20, 10, 90, 80, NA, 20, 95, 90, 80, NA),
  .Dim = c(4, 4),
  .Dimnames = list(LETTERS[1:4], LETTERS[1:4]))

A
#    A  B  C  D
# A NA 80 90 95
# B 20 NA 80 90
# C 10 20 NA 80
# D  5 10 20 NA
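
Since the number of elements is meant to be dynamic, here is a rough sketch of how a matrix like this could be assembled without hardcoding each cell. It assumes the comparisons arrive as a long-format data frame; `pairs` and its columns `x`, `y`, `sx`, `sy` are made-up names for illustration, not something from the question.

```r
# Hypothetical input: one row per compared pair, with the score each side received.
pairs <- data.frame(
  x  = c("A", "A", "A", "B", "B", "C"),
  y  = c("B", "C", "D", "C", "D", "D"),
  sx = c(80, 90, 95, 80, 90, 80),  # score of x when compared against y
  sy = c(20, 10,  5, 20, 10, 20),  # score of y when compared against x
  stringsAsFactors = FALSE
)

# Empty square matrix with one row and one column per item.
items <- sort(unique(c(pairs$x, pairs$y)))
A <- matrix(NA_real_, nrow = length(items), ncol = length(items),
            dimnames = list(items, items))

# A two-column character matrix used as an index selects cells by (row, column) name.
A[cbind(pairs$x, pairs$y)] <- pairs$sx
A[cbind(pairs$y, pairs$x)] <- pairs$sy

A
```

Because the cells are addressed by name, the same code should work for any number of items, as long as every pair appears once in `pairs`.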

This matrix is the result of a pairwise comparison on a vector of length 4. We know nothing about this vector, and the only thing we know about the function used in the comparison is that it is binary and non-commutative, or more precisely: f(x, y) = 100 - f(y, x), with the result in [0, 100].
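
If you want to confirm that a given matrix really has these two properties before relying on them, a quick check could look like this. The helper names `off_diag`, `complementary` and `in_range` are mine, just for illustration.

```r
A <- structure(
  c(NA, 20, 10, 5, 80, NA, 20, 10, 90, 80, NA, 20, 95, 90, 80, NA),
  .Dim = c(4, 4),
  .Dimnames = list(LETTERS[1:4], LETTERS[1:4]))

# Logical mask selecting everything except the (NA) diagonal.
off_diag <- row(A) != col(A)

# f(x, y) = 100 - f(y, x): each cell plus its mirror image should equal 100.
complementary <- all((A + t(A))[off_diag] == 100)

# All scores should lie in [0, 100].
in_range <- all(A[off_diag] >= 0 & A[off_diag] <= 100)

complementary
in_range
```

Both values should come out TRUE for the example matrix. With non-integer scores, `isTRUE(all.equal(...))` would be a safer comparison than `==`.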

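matrixB appears to be simply matrixA divided element-wise by its own transpose. With A laid out as above (the transpose of your matrixA), that reads:

B = A^T / A (element-wise, so B[i, j] = A[j, i] / A[i, j])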

or if you prefer:

B = (100 - A) / A

Potato, potahto: the two give the same result because f(x, y) = 100 - f(y, x), as mentioned above.

# either of these two lines gives the same B
B <- (100 - A) / A
B <- t(A) / A

# fill in the diagonal with 1s
diag(B) <- 1

round(B, 2)
#    A    B    C    D
# A  1 0.25 0.11 0.05
# B  4 1.00 0.25 0.11
# C  9 4.00 1.00 0.25
# D 19 9.00 4.00 1.00
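
The question also computes dfC and dfCMeans. As far as I can tell, dfC is each column of B divided by the column to its right, so a sketch under that assumption could be the following. The names `n`, `ratios` and `ratio.means` are mine, and the result is a matrix rather than a data frame.

```r
A <- structure(
  c(NA, 20, 10, 5, 80, NA, 20, 10, 90, 80, NA, 20, 95, 90, 80, NA),
  .Dim = c(4, 4),
  .Dimnames = list(LETTERS[1:4], LETTERS[1:4]))

B <- t(A) / A
diag(B) <- 1

n <- ncol(B)

# Columns 1..(n-1) divided element-wise by columns 2..n.
ratios <- B[, -n, drop = FALSE] / B[, -1, drop = FALSE]

# Label each column by the pair of columns it was built from: "AB", "BC", ...
colnames(ratios) <- paste0(colnames(B)[-n], colnames(B)[-1])

# Counterpart of dfCMeans in the question.
ratio.means <- colMeans(ratios)
```

Wrap `ratios` in `as.data.frame()` if you need a data frame like dfC.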

The 'normalized' matrix as you call it seems to be simply each column divided by its sum.

B.norm <- t(t(B) / colSums(B))

round(B.norm, 3)
#       A     B     C     D
# A 0.030 0.018 0.021 0.037
# B 0.121 0.070 0.047 0.079
# C 0.273 0.281 0.187 0.177
# D 0.576 0.632 0.746 0.707
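
If the double transpose looks cryptic, base R has a couple of alternatives that should give the same result. This is only a sketch; `B.sweep` and `B.prop` are names I made up for the comparison.

```r
A <- structure(
  c(NA, 20, 10, 5, 80, NA, 20, 10, 90, 80, NA, 20, 95, 90, 80, NA),
  .Dim = c(4, 4),
  .Dimnames = list(LETTERS[1:4], LETTERS[1:4]))

B <- t(A) / A
diag(B) <- 1

# The double-transpose version from above.
B.norm <- t(t(B) / colSums(B))

# sweep() divides each column (margin 2) by the matching column sum.
B.sweep <- sweep(B, 2, colSums(B), "/")

# prop.table() with margin = 2 expresses each cell as a share of its column total.
B.prop <- prop.table(B, margin = 2)

all.equal(B.norm, B.sweep)
all.equal(B.norm, B.prop)

# Every column of the normalized matrix should sum to 1.
colSums(B.norm)
```

Which one to use is mostly a matter of taste; `sweep()` states the intent most explicitly.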

Upvotes: 1
