SriniShine

Reputation: 1139

R: creating a symmetric matrix with xtabs

I have four models: M1, M2, M3 and M4. I compare the models pair-wise and get a score for each pair. I only do the comparisons one way: M1 with M2, but not M2 with M1, as the score would be the same. I want to convert these scores into a symmetric matrix.

I was able to convert the data set into a matrix using xtabs, but it doesn't have the M1-M1 and M3-M3 distance.

d <- data.frame(M1 = c("M1", "M1", "M1", "M2", "M2", "M3"),
            M2 = c("M2", "M3", "M4", "M3", "M4", "M4"),
            C = c(1, 1, 4, 2, 2, 6))

dm = xtabs(C~M1+M2, data=d)


> d
  M1 M2 C
1 M1 M2 1
2 M1 M3 1
3 M1 M4 4
4 M2 M3 2
5 M2 M4 2
6 M3 M4 6
> dm
    M2
M1   M2 M3 M4
  M1  1  1  4
  M2  0  2  2
  M3  0  0  6

I tried copying the upper triangle to the lower triangle, but it doesn't work properly because the matrix is not symmetric. I would like to know how to include the M1-M1 and M3-M3 distance and make it a symmetric matrix. Even though the distance is 0, will it be a problem when I try to convert the matrix into a dist() object?

> dm[lower.tri(dm)] <- t(dm)[lower.tri(dm)]
> dm
    M2
M1   M2 M3 M4
  M1  1  1  4
  M2  1  2  2
  M3  4  2  6

Upvotes: 2

Views: 566

Answers (2)

user20650

Reputation: 25854

To get a symmetric matrix, you likely want to set the same levels (M1 to M4) on each dimension. One way to do this is to convert both variables to factors with the same set of factor levels.

d[c("M1", "M2")] <- lapply(d[c("M1", "M2")], factor, levels=unique(unlist(d[c("M1", "M2")]))) 

You can then use xtabs as before, and add the result to the transpose of the result.

dm <- xtabs(C ~ M1 + M2, data=d)

dm + t(dm)

#    M2
#M1   M1 M2 M3 M4
#  M1  0  1  1  4
#  M2  1  0  2  2
#  M3  1  2  0  6
#  M4  4  2  6  0
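
On the dist() part of the question: as far as I know, as.dist() only reads the lower triangle of a square matrix, so a zero diagonal should not cause trouble. A minimal sketch, assuming the data frame from the question (re-created here with stringsAsFactors = FALSE) and using the helper names lv, sym and m, which are only illustrative:

```r
# Data from the question; stringsAsFactors = FALSE is an assumption so the
# sketch behaves the same on old and new R versions.
d <- data.frame(M1 = c("M1", "M1", "M1", "M2", "M2", "M3"),
                M2 = c("M2", "M3", "M4", "M3", "M4", "M4"),
                C  = c(1, 1, 4, 2, 2, 6),
                stringsAsFactors = FALSE)

# Shared set of levels for both dimensions (hypothetical helper name).
lv <- unique(unlist(d[c("M1", "M2")]))
d[c("M1", "M2")] <- lapply(d[c("M1", "M2")], factor, levels = lv)

# Square table, then symmetrise by adding the transpose.
dm  <- xtabs(C ~ M1 + M2, data = d)
sym <- dm + t(dm)

# Copy into a plain numeric matrix before handing it to as.dist().
m <- matrix(as.vector(sym), nrow = length(lv), dimnames = list(lv, lv))

dd <- as.dist(m)   # lower triangle only; the zero diagonal is not used
print(dd)
```

The copy into a plain matrix is just a precaution to drop the table class; the resulting object can then be passed to functions such as hclust().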

Upvotes: 1

G. Grothendieck

Reputation: 269556

Add its transpose. That counts the diagonal twice, so subtract it out once.

dm + t(dm) - diag(diag(dm))

giving:

    M2
M1   M2 M3 M4
  M1  1  1  4
  M2  1  2  2
  M3  4  2  6

If we know that all elements are non-negative then this would also work:

pmax(dm, t(dm))
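
As a quick sanity check, here is a sketch that applies both expressions to a square table. It assumes the factor levels were first aligned to M1..M4 on both dimensions, as in the other answer; the names lv, a and b are illustrative only.

```r
# Data from the question (stringsAsFactors = FALSE is an assumption).
d <- data.frame(M1 = c("M1", "M1", "M1", "M2", "M2", "M3"),
                M2 = c("M2", "M3", "M4", "M3", "M4", "M4"),
                C  = c(1, 1, 4, 2, 2, 6),
                stringsAsFactors = FALSE)

# Give both variables the same levels so xtabs returns a 4 x 4 table.
lv <- unique(c(d$M1, d$M2))
d$M1 <- factor(d$M1, levels = lv)
d$M2 <- factor(d$M2, levels = lv)
dm <- xtabs(C ~ M1 + M2, data = d)

# Transpose-and-subtract version and the pmax version.
a <- dm + t(dm) - diag(diag(dm))
b <- pmax(dm, t(dm))

# Both should agree, and each should equal its own transpose.
print(all.equal(as.vector(a), as.vector(b)))
print(all.equal(as.vector(a), as.vector(t(a))))
```

With all scores non-negative and the diagonal zero, the two expressions give the same symmetric table here.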

Upvotes: 1
