Reputation: 5719
I have a distance matrix called mydist. I want to extract the lower triangle of the matrix as pairwise combinations of row and column names together with their values. For example:
sampleA sampleB values
S05-F13-P01_C S05-F13-P01_C 2251
S08-F10-P01_C S08-F10-P01_C 2246
... and so on
Data
mydist<-structure(c("2251", "1923", "2085", "1954", "2105", "0", "2246",
"2094", "1955", "2127", "0", "0", "2521", "2110", "2329", "0",
"0", "0", "2276", "2141", "0", "0", "0", "0", "2561"), .Dim = c(5L,
5L), .Dimnames = list(c("S05-F13-P01_C", "S08-F10-P01_C", "S08-F11-P01_C",
"S09-F66-P01_C", "S09-F67-P01_C"), c("S05-F13-P01_C", "S08-F10-P01_C",
"S08-F11-P01_C", "S09-F66-P01_C", "S09-F67-P01_C")))
Upvotes: 3
Views: 1086
Reputation: 193667
I would consider the following:
data.frame(as.table(mydist))[lower.tri(mydist, diag = TRUE), ]
## Var1 Var2 Freq
## 1 S05-F13-P01_C S05-F13-P01_C 2251
## 2 S08-F10-P01_C S05-F13-P01_C 1923
## 3 S08-F11-P01_C S05-F13-P01_C 2085
## 4 S09-F66-P01_C S05-F13-P01_C 1954
## 5 S09-F67-P01_C S05-F13-P01_C 2105
## 7 S08-F10-P01_C S08-F10-P01_C 2246
## 8 S08-F11-P01_C S08-F10-P01_C 2094
## 9 S09-F66-P01_C S08-F10-P01_C 1955
## 10 S09-F67-P01_C S08-F10-P01_C 2127
## 13 S08-F11-P01_C S08-F11-P01_C 2521
## 14 S09-F66-P01_C S08-F11-P01_C 2110
## 15 S09-F67-P01_C S08-F11-P01_C 2329
## 19 S09-F66-P01_C S09-F66-P01_C 2276
## 20 S09-F67-P01_C S09-F66-P01_C 2141
## 25 S09-F67-P01_C S09-F67-P01_C 2561
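The result above keeps the default column names Var1, Var2 and Freq, and because mydist is a character matrix, Freq is not numeric. Here is a minimal sketch of tidying that up, assuming a small toy matrix `m` in place of mydist (`m`, `res` and the toy values are illustrative only, not from the question):

```r
# Toy character matrix standing in for mydist (hypothetical values)
m <- matrix(c("1", "2", "3", "0", "4", "5", "0", "0", "6"), nrow = 3,
            dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

# Same approach as above: long format, then keep only lower-triangle rows
res <- data.frame(as.table(m))[lower.tri(m, diag = TRUE), ]

# Rename to the column names asked for in the question
names(res) <- c("sampleA", "sampleB", "values")

# Name columns may be factors; values may be character or factor,
# so go through as.character() before converting to numeric
res$sampleA <- as.character(res$sampleA)
res$sampleB <- as.character(res$sampleB)
res$values  <- as.numeric(as.character(res$values))

res
```

Going through as.character() first should keep the conversion safe whether or not the R version in use turns strings into factors by default.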
Upvotes: 3
Reputation: 38520
This seems to work:
cbind(rownames(mydist)[which(lower.tri(mydist, diag = TRUE), arr.ind = TRUE)[, 1]],
      colnames(mydist)[which(lower.tri(mydist, diag = TRUE), arr.ind = TRUE)[, 2]],
      mydist[lower.tri(mydist, diag = TRUE)])
Or, transforming it into a data.frame as @akrun does:
temp1 <- data.frame(sampleA = rownames(mydist)[which(lower.tri(mydist, diag = TRUE), arr.ind = TRUE)[, 1]],
                    sampleB = colnames(mydist)[which(lower.tri(mydist, diag = TRUE), arr.ind = TRUE)[, 2]],
                    value = as.numeric(mydist[lower.tri(mydist, diag = TRUE)]),
                    stringsAsFactors = FALSE)
Upvotes: 1
Reputation: 887711
We can try
i1 <- lower.tri(mydist, diag=TRUE)
i2 <- which(i1, arr.ind=TRUE)
data.frame(sampleA = rownames(mydist)[i2[, 1]],  # i2[, 1] holds row indices
           sampleB = colnames(mydist)[i2[, 2]],  # i2[, 2] holds column indices
           value = mydist[i1])
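
The same idea can be wrapped in a small reusable function. This is only a sketch: the name `lower_tri_long`, its `diag` argument, and the toy matrix `m` are hypothetical and not part of the question. It looks up row indices with rownames(), column indices with colnames(), and converts the values to numeric.

```r
# Hypothetical helper: lower triangle of a named matrix in long format
lower_tri_long <- function(m, diag = TRUE) {
  keep <- lower.tri(m, diag = diag)     # logical mask of the lower triangle
  idx  <- which(keep, arr.ind = TRUE)   # row and column index of each kept cell
  data.frame(sampleA = rownames(m)[idx[, 1]],
             sampleB = colnames(m)[idx[, 2]],
             value   = as.numeric(m[keep]),
             stringsAsFactors = FALSE)
}

# Toy character matrix standing in for mydist (hypothetical values)
m <- matrix(c("1", "2", "3", "0", "4", "5", "0", "0", "6"), nrow = 3,
            dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

with_diag    <- lower_tri_long(m)                # includes the diagonal
without_diag <- lower_tri_long(m, diag = FALSE)  # strictly below the diagonal
```

Calling `lower_tri_long(mydist)` should give the same rows as the code above, and `diag = FALSE` drops the self-comparisons if those are not wanted.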
Upvotes: 4