Eric W.

Reputation: 814

Changing row and column names on matrices to numbers

I'm constructing an adjacency matrix to use with the bipartite package. Rows and columns represent entities of two different classes, and m[i,j] represents an interaction between entity i of the first class and entity j of the second. I currently have a data frame df of the form

     s1   s2 weight
1   261  446      1
2   188  259      4
3   144 1119      1

where, for example, row 2 represents an interaction between member 188 of s1 and 259 of s2 of weight 4. So m[259,188] should be 4. However, since not every value between 1 and max(df$s1, df$s2) will be represented, using the normal indexes won't work. If it were possible, I'd want something like this:

        [,144] [,188] [,261] 
 [259,]      0      4      0
 [446,]      0      0      1
[1119,]      1      0      0

I know I can rename columns and rows to a character vector, but I think it would be inefficient/unwieldy to set it to as.character(unique(df$s1)) (and similarly for s2) and index it that way. I also considered keeping a vector of the unique elements of s1 and s2 and using e.g. m[which(unique.s2 == i), which(unique.s1 == j)], but again, that seems like a suboptimal solution. Since not every number between min(s1) and max(s1) will be in the matrix, I can't just make the dimensions c(max(s1), max(s2)) and use the indexes directly.

Is there a better way to accomplish my goal?

Upvotes: 1

Views: 1285

Answers (1)

mdsumner

Reputation: 29477

You can index the matrix by its row and column names, given as character strings.

First create the matrix with the sorted indices (s2 is rows as per your example).

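# Example data from the question
s1 <- c(261, 188, 144); s2 <- c(446, 259, 1119)
# Zero matrix whose dimnames are the sorted identifiers as character
m <- matrix(0, length(s2), length(s1), dimnames = list(as.character(sort(s2)), as.character(sort(s1))))

weight <- c(1, 4, 1)
# A two-column character matrix selects cells by (row name, column name)
m[cbind(as.character(s2), as.character(s1))] <- weight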

     144 188 261
259    0   4   0
446    0   0   1
1119   1   0   0
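
Starting from the data frame in the question, the same steps might look like the sketch below. The names df, s1, s2 and weight come from the question; rows and cols are hypothetical helper names introduced here, and unique() is added on the assumption that an identifier can appear in more than one interaction.

```r
# Data frame from the question
df <- data.frame(s1 = c(261, 188, 144),
                 s2 = c(446, 259, 1119),
                 weight = c(1, 4, 1))

# Sorted unique identifiers become the dimnames (s2 as rows, s1 as columns)
rows <- as.character(sort(unique(df$s2)))
cols <- as.character(sort(unique(df$s1)))
m <- matrix(0, length(rows), length(cols), dimnames = list(rows, cols))

# A two-column character matrix indexes by row name and column name
m[cbind(as.character(df$s2), as.character(df$s1))] <- df$weight

# Look up a single interaction by name; note the quotes
m["259", "188"]
```

Quoting the identifiers matters: m["259", "188"] looks up by name, whereas m[259, 188] would be treated as positional indices and fail on a 3 x 3 matrix.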

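Alternatively, with a full-size matrix you can index by the numbers directly (here rows are s1 and columns are s2):

m <- matrix(0, 261, 1119)
m[cbind(s1, s2)] <- weight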

If you want NA rather than zero as the default value, replace the 0 in the matrix() call with as.numeric(NA). You didn't specify the number of rows or columns, so I just used the maximum values.
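
As a minimal sketch of the NA default applied to the named matrix, assuming the same example vectors as above (m_na is a hypothetical name, and NA_real_ is R's built-in numeric NA constant, equivalent to as.numeric(NA)):

```r
s1 <- c(261, 188, 144); s2 <- c(446, 259, 1119)
weight <- c(1, 4, 1)

# Fill with numeric NA instead of 0
m_na <- matrix(NA_real_, length(s2), length(s1),
               dimnames = list(as.character(sort(s2)), as.character(sort(s1))))
m_na[cbind(as.character(s2), as.character(s1))] <- weight

# Pairs with no recorded interaction stay NA
is.na(m_na["259", "144"])
```

With this default, a missing interaction can be told apart from a recorded weight of zero.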

Upvotes: 2
