Zachary
Zachary

Reputation: 381

Transform adjacency lists to binary matrix in R

Given a list of the locations of 1s in each row, I'm trying to find an efficient way to construct a binary matrix. Here's a small example, although I’m trying to find something that scales well -

Given a binary matrix:

> M <- matrix(rbinom(25,1,0.5),5,5)
> M
     [,1] [,2] [,3] [,4] [,5]
[1,]    0    1    1    1    0
[2,]    0    1    1    1    1
[3,]    1    1    0    1    1
[4,]    1    0    0    1    0
[5,]    0    1    1    0    0

I can transform M into an adjacency list using:

> Mlist <- apply(M==1, 1, which, simplify = FALSE)
> Mlist
[[1]]
[1] 2 3 4

[[2]]
[1] 2 3 4 5

[[3]]
[1] 1 2 4 5

[[4]]
[1] 1 4

[[5]]
[1] 2 3

I'd like to transform Mlist back into M. One possibility is:

M.new <- matrix(0,5,5)
for (row in 1:5){M.new[row,Mlist[[row]]] <- 1}

But, it seems like there should be a more efficient way.

Thanks!

Upvotes: 2

Views: 165

Answers (2)

G. Grothendieck
G. Grothendieck

Reputation: 269674

1) Using M and Mlist defined in the Note at the end, sapply over its components replacing a vector of zeros with ones at the needed locations. Transpose at the end.

M2 <- t(sapply(Mlist, replace, x = integer(length(Mlist)), 1L))

identical(M, M2)  # check that M2 equals M
## [1] TRUE

2) A variation with slightly more keystrokes, but faster, would be

M3 <- do.call("rbind", lapply(Mlist, replace, x = integer(length(Mlist)), 1L))

identical(M, M3)
## [1] TRUE

Benchmark

Here ex1 and ex2 are (1) and (2) above and ex0 is the for loop in the question except we used integer instead of double. Note that (2) is about 100x faster then the loop in the question.

library(microbenchmark)

microbenchmark(
  ex0 = { M.new <- matrix(0L,5,5); for (row in 1:5){M.new[row,Mlist[[row]]] <- 1L} },
  ex1 = t(sapply(Mlist, replace, x = integer(length(Mlist)), 1L)),
  ex2 = do.call("rbind", lapply(Mlist, replace, x = integer(length(Mlist)), 1L))
)

giving:

Unit: microseconds
 expr    min      lq     mean median      uq    max neval cld
  ex0 4454.4 4504.15 4639.111 4564.1 4670.10 8450.2   100   b
  ex1   73.1   84.75   98.220   94.3  111.75  130.8   100  a 
  ex2   32.0   36.20   43.866   42.7   51.85   82.5   100  a 

Note

set.seed(123)
M <- matrix(rbinom(25,1,0.5),5,5)
Mlist <- apply(M==1, 1, which, simplify = FALSE)

Upvotes: 2

akrun
akrun

Reputation: 887213

Using the vectorized row/column indexing - replicate the sequence of 'Mlist' by the lengths of the 'Mlist', and cbind with the unlisted 'Mlist' to create a matrix which can be used to assign the subset of elements of 'M.new' to 1

ind <- cbind(rep(seq_along(Mlist), lengths(Mlist)), unlist(Mlist))
M.new[ind] <- 1

-checking

> all.equal(M, M.new)
[1] TRUE

Or another option is sparseMatrix

library(Matrix)
as.matrix(sparseMatrix(i = rep(seq_along(Mlist), lengths(Mlist)),
      j = unlist(Mlist), x = 1))
     [,1] [,2] [,3] [,4] [,5]
[1,]    0    0    1    1    1
[2,]    0    1    0    1    0
[3,]    1    0    0    1    0
[4,]    0    1    0    1    0
[5,]    1    0    1    1    1

Upvotes: 1

Related Questions