Reputation: 127
I have the following table:
avg m1 m2
3 gl fr
1 fr tr
2 tr fr
4 la gl
5 na fr
6 tr tr
4 tr fr
I would like to get a matrix having all the entries of m1 and m2 as row- and column names, in a specific order. It also should be symmetric. The entries in the matrix are average values of all combinations of m1 and m2. So the 2.3 in the table below is the average of fr-tr, tr-fr and tr-fr.
gl tr la na fr
gl nan nan 4 nan 3
tr na 6 na nan 2.3
la 4 nan nan nan nan
na nan nan nan nan 5
fr 3 2.3 nan 5 nan
What I tried sofar:
matrix <- acast(all, m1~m2, value.var="avg",fun.aggregate = mean)
matrix<-subset(matrix, select=c(gl,tr,la,na,fr))
matrix<-t(matrix)
matrix<-subset(matrix, select=c(gl,tr,la,na,fr))
matrix<-t(matrix)
That doesn't quite do what I need. It is not symmetric and it makes a distinction between the order of m1 and m2.
Also, for a symmetric matrix, how would I make it triangular?
Upvotes: 1
Views: 67
Reputation: 160827
Your data:
df <- read.table(text='avg m1 m2
3 gl fr
1 fr tr
2 tr fr
4 la gl
5 na fr
6 tr tr
4 tr fr', header=T, stringsAsFactors=F)
Your names, ordered:
nm <- c('gl', 'tr', 'la', 'na', 'fr')
The matrix to store the results:
m <- matrix(nrow = length(nm), ncol = length(nm),
dimnames = list(nm, nm))
First, since your desired result is symmetric, we aren't concerned about the ordering of m1
versus m2
. By sorting them (m1
is always "less than" m2
), we make the aggregation simpler:
df[,c('m1','m2')] <- t(apply(df[,c('m1','m2')], 1, sort))
Though often overlooked, this is a classic example for using aggregate
:
df2 <- aggregate(avg ~ m1 + m2, df, mean)
for (i in 1:nrow(df2)) {
m[ df2$m1[i], df2$m2[i] ] <- m[ df2$m2[i], df2$m1[i] ] <- df2$avg[i]
}
m
# gl tr la na fr
# gl NA NA 4 NA 3.000000
# tr NA 6.000000 NA NA 2.333333
# la 4 NA NA NA NA
# na NA NA NA NA 5.000000
# fr 3 2.333333 NA 5 NA
Upvotes: 2