Reputation: 843
I have a very large matrix that has column names and row names. These column and row names are identical and contain a three letter string with a number.
The three letter string repeats itself and only the number changes. After several repetitions, the string changes and the number starts from 1 again.
To give a small example, I have a matrix a
:
a <- matrix(c(1:36), nrow = 6, byrow = TRUE)
names <- paste(rep(c("aaa" , "bbb", "ccc"), each = 2) , rep(c(1:2) , times = 3))
rownames(a) <- names
colnames(a) <- names
that gives:
aaa 1 aaa 2 bbb 1 bbb 2 ccc 1 ccc 2
aaa 1 1 2 3 4 5 6
aaa 2 7 8 9 10 11 12
bbb 1 13 14 15 16 17 18
bbb 2 19 20 21 22 23 24
ccc 1 25 26 27 28 29 30
ccc 2 31 32 33 34 35 36
For each element of this matrix I would like to do a multiplication.
So basically for every element where the string of the row / column names are different, I want to match the number of the row name / column name and multiply with the other string.
If "aaa"
is matched with "bbb"
then:
matrix[aaa (number n), bbb (number m)] * matrix[bbb (number n), aaa (number m)]
if "aaa"
is equal to "aaa"
then
matrix[aaa (number n), aaa (number m)] * matrix[aaa (number n), aaa (number m)]
or basically the element squared.
It looks confusing so I give a few examples:
in matrix["aaa 1", "aaa 2"]
I would multiply matrix["aaa 1", "aaa 2"]
with matrix["aaa 1", "aaa 2"]
(2*2) giving 4
in matrix["aaa 1", "bbb 2"]
I would multiply matrix["aaa 1", "bbb 2"]
with matrix["bbb 1", "aaa 2"]
(4*14) giving 56
In the end, the matrix should give:
aaa 1 aaa 2 bbb 1 bbb 2 ccc 1 ccc 2
aaa 1 1 4 39 56 125 156
aaa 2 49 64 171 200 341 384
bbb 1 39 56 225 256 459 504
bbb 2 171 200 441 484 759 816
ccc 1 125 156 459 504 841 900
ccc 2 341 384 759 816 1225 1296
which I get using terrible code:
b <- a^2
b[1,3] <- a[1,3] * a[3,1]
b[1,4] <- a[1,4] * a[3,2]
b[1,5] <- a[1,5] * a[5,1]
b[1,6] <- a[1,6] * a[5,2]
b[2,3] <- a[2,3] * a[4,1]
b[2,4] <- a[2,4] * a[4,2]
b[2,5] <- a[2,5] * a[6,1]
b[2,6] <- a[2,6] * a[6,2]
b[3,1] <- a[3,1] * a[1,3]
b[3,2] <- a[3,2] * a[1,4]
b[3,5] <- a[3,5] * a[5,3]
b[3,6] <- a[3,6] * a[5,4]
b[4,1] <- a[4,1] * a[2,3]
b[4,2] <- a[4,2] * a[2,4]
b[4,5] <- a[4,5] * a[6,3]
b[4,6] <- a[4,6] * a[6,4]
b[5,1] <- a[5,1] * a[1,5]
b[5,2] <- a[5,2] * a[1,6]
b[5,3] <- a[5,3] * a[3,5]
b[5,4] <- a[5,4] * a[3,6]
b[6,1] <- a[6,1] * a[2,5]
b[6,2] <- a[6,2] * a[2,6]
b[6,3] <- a[6,3] * a[4,5]
b[6,4] <- a[6,4] * a[4,6]
Is there a code that lets me do this type of matching? Maybe also when there are more strings and numbers?
Upvotes: 0
Views: 87
Reputation: 79208
An easy way that is vectorized Using base R you can do:
b = expand.grid(dimnames(a))[2:1]
d = with(read.table(text=paste(b[,1],b[,2])),cbind(paste(V3,V2),paste(V1,V4)))
t(array(a[as.matrix(b)]*a[d],dim(a)))
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 4 39 56 125 156
[2,] 49 64 171 200 341 384
[3,] 39 56 225 256 459 504
[4,] 171 200 441 484 759 816
[5,] 125 156 459 504 841 900
[6,] 341 384 759 816 1225 1296
Upvotes: 0
Reputation: 132696
You can use a data.table join:
library(data.table)
#melt to long format
DT <- setDT(melt(a))
#split ID columns
DT[, c("Var1str", "Var1num", "Var2str", "Var2num") := cbind(tstrsplit(Var1, " "),
tstrsplit(Var2, " "))]
#simple join on split IDs, flipping the strings
DT[DT, res := value * i.value, on = c(Var1num = "Var1num", Var2num = "Var2num",
Var1str = "Var2str", Var2str = "Var1str")]
#create wide-format matrix
res <- dcast(DT, Var1 ~ Var2, value.var = "res")
rn <- res[["Var1"]]
res <- as.matrix(res[, -1])
rownames(res) <- rn
res
# aaa 1 aaa 2 bbb 1 bbb 2 ccc 1 ccc 2
#aaa 1 1 4 39 56 125 156
#aaa 2 49 64 171 200 341 384
#bbb 1 39 56 225 256 459 504
#bbb 2 171 200 441 484 759 816
#ccc 1 125 156 459 504 841 900
#ccc 2 341 384 759 816 1225 1296
Upvotes: 3