doc
doc

Reputation: 127

Create matrix from column entries

I have the following table:

avg   m1   m2
 3    gl   fr
 1    fr   tr
 2    tr   fr
 4    la   gl
 5    na   fr
 6    tr   tr
 4    tr   fr

I would like to get a matrix having all the entries of m1 and m2 as row- and column names, in a specific order. It also should be symmetric. The entries in the matrix are average values of all combinations of m1 and m2. So the 2.3 in the table below is the average of fr-tr, tr-fr and tr-fr.

      gl   tr   la   na   fr
gl    nan  nan  4    nan  3
tr    na   6    na   nan  2.3
la    4    nan  nan  nan  nan  
na    nan  nan  nan  nan  5
fr    3    2.3  nan  5    nan  

What I tried sofar:

matrix <- acast(all, m1~m2, value.var="avg",fun.aggregate = mean)
matrix<-subset(matrix, select=c(gl,tr,la,na,fr))
matrix<-t(matrix)
matrix<-subset(matrix, select=c(gl,tr,la,na,fr))
matrix<-t(matrix)

That doesn't quite do what I need. It is not symmetric and it makes a distinction between the order of m1 and m2.

Also, for a symmetric matrix, how would I make it triangular?

Upvotes: 1

Views: 67

Answers (1)

r2evans
r2evans

Reputation: 160827

Your data:

df <- read.table(text='avg   m1   m2
 3    gl   fr
 1    fr   tr
 2    tr   fr
 4    la   gl
 5    na   fr
 6    tr   tr
 4    tr   fr', header=T, stringsAsFactors=F)

Your names, ordered:

nm <- c('gl', 'tr', 'la', 'na', 'fr')

The matrix to store the results:

m <- matrix(nrow = length(nm), ncol = length(nm),
            dimnames = list(nm, nm))

First, since your desired result is symmetric, we aren't concerned about the ordering of m1 versus m2. By sorting them (m1 is always "less than" m2), we make the aggregation simpler:

df[,c('m1','m2')] <- t(apply(df[,c('m1','m2')], 1, sort))

Though often overlooked, this is a classic example for using aggregate:

df2 <- aggregate(avg ~ m1 + m2, df, mean)
for (i in 1:nrow(df2)) {
  m[ df2$m1[i], df2$m2[i] ] <- m[ df2$m2[i], df2$m1[i] ] <- df2$avg[i]
}
m
#    gl       tr la na       fr
# gl NA       NA  4 NA 3.000000
# tr NA 6.000000 NA NA 2.333333
# la  4       NA NA NA       NA
# na NA       NA NA NA 5.000000
# fr  3 2.333333 NA  5       NA

Upvotes: 2

Related Questions