yrx1702

Reputation: 1641

Transform binary vector to binary matrix

I have a binary vector that holds information on whether or not some event happened for some observation:

v <- c(0,1,1,0)

What I want to achieve is a matrix that holds information on all pairs of observations in this vector. That is, if two observations both have 0 or both have 1 in v, the corresponding entry of the matrix should be 1. If one has 0 and the other has 1, the entry should be 0.

Hence, the goal is this matrix:

     [,1] [,2] [,3] [,4]
[1,]    0    0    0    1
[2,]    0    0    1    0
[3,]    0    1    0    0
[4,]    1    0    0    0

Whether the main diagonal is 0 or 1 does not matter to me.

Is there an efficient and simple way to achieve this that does not require a combination of if statements and for loops? v might be of considerable size.

Thanks!

Upvotes: 4

Views: 231

Answers (4)

Zheyuan Li

Reputation: 73265

If you allow the main diagonal to be 1, then the matrix has at most two distinct rows, v and 1 - v, no matter how long v is. Since the matrix is symmetric, the same holds for its columns. This makes the matrix trivial to construct.

## example `v`
set.seed(0)
v <- sample.int(2, 10, replace = TRUE) - 1L
#[1] 1 0 0 1 1 0 1 1 1 1

## column expansion from unique columns
cbind(v, 1 - v, deparse.level = 0L)[, 2 - v]
#      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,]    1    0    0    1    1    0    1    1    1     1
# [2,]    0    1    1    0    0    1    0    0    0     0
# [3,]    0    1    1    0    0    1    0    0    0     0
# [4,]    1    0    0    1    1    0    1    1    1     1
# [5,]    1    0    0    1    1    0    1    1    1     1
# [6,]    0    1    1    0    0    1    0    0    0     0
# [7,]    1    0    0    1    1    0    1    1    1     1
# [8,]    1    0    0    1    1    0    1    1    1     1
# [9,]    1    0    0    1    1    0    1    1    1     1
#[10,]    1    0    0    1    1    0    1    1    1     1
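
As a quick sanity check (a sketch, not part of the original answer), the column expansion can be compared with a direct pairwise comparison using base R's outer. The names expanded and pairwise are made up for illustration, and v is typed in by hand from the values printed above.

```r
# The example v printed above, entered literally so no RNG is needed
v <- c(1L, 0L, 0L, 1L, 1L, 0L, 1L, 1L, 1L, 1L)

# Column expansion: pick column 1 (v) where v == 1, column 2 (1 - v) where v == 0
expanded <- cbind(v, 1 - v, deparse.level = 0L)[, 2 - v]

# Direct pairwise comparison; adding 0L turns the logical matrix into 0/1
pairwise <- outer(v, v, `==`) + 0L

# Both constructions agree entry by entry (main diagonal is 1 in both)
stopifnot(all(expanded == pairwise))
```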

What is the purpose of this matrix?

If there are n0 zeros and n1 ones, the matrix has dimension (n0 + n1) x (n0 + n1), but only n0 x n0 + n1 x n1 of its entries are ones. More importantly, for a long vector v it is highly redundant, as it has a large number of duplicated rows / columns.

So if all you need is the positions of the 1s in this matrix, you can get them directly without forming the matrix at all.
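
A minimal sketch of that idea, assuming v contains only 0 and 1: the 1s are exactly the index pairs within the group of zeros plus the index pairs within the group of ones. The names idx0, idx1 and pos are made up for illustration.

```r
v <- c(0, 1, 1, 0)

# Indices of the observations in each group
idx0 <- which(v == 0)
idx1 <- which(v == 1)

# Every (row, col) pair inside a group is a 1 in the matrix,
# including the pairs on the main diagonal
pos <- rbind(expand.grid(row = idx0, col = idx0),
             expand.grid(row = idx1, col = idx1))
pos
```

pos has n0 x n0 + n1 x n1 rows. If the diagonal pairs are not wanted, they can be dropped with pos[pos$row != pos$col, ].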

Upvotes: 2

LAP

Reputation: 6685

An alternative to outer (slightly less efficient) is sapply:

out <- sapply(v, function(x){
  x == v
})
diag(out) <- 0L
out

     [,1] [,2] [,3] [,4]
[1,]    0    0    0    1
[2,]    0    0    1    0
[3,]    0    1    0    0
[4,]    1    0    0    0

A microbenchmark comparison on a vector v of length 1000:

> test <- microbenchmark("LAP" = sapply(v, function(x){
+   x == v
+ }),
+ "markus" = outer(v, v, `==`), times = 1000, unit = "ms")
> test
Unit: milliseconds
   expr      min       lq     mean   median       uq       max neval
    LAP 3.973111 4.065555 5.747905 4.573002 6.324607 101.03498  1000
 markus 3.515725 3.535067 4.852606 3.694924 4.908930  84.85184  1000

Upvotes: 2

Ronak Shah

Reputation: 388817

Another option is to use expand.grid to create all pairwise combinations of v with itself. Since v contains only 0s and 1s, the matching pairs are exactly those whose sum is 0 (0 + 0) or 2 (1 + 1).

inds <- rowSums(expand.grid(v, v))
matrix(+(inds == 0 | inds == 2), nrow = length(v))


#     [,1] [,2] [,3] [,4]
#[1,]    1    0    0    1
#[2,]    0    1    1    0
#[3,]    0    1    1    0
#[4,]    1    0    0    1

Since the diagonal elements are not important to you, I have left them as they are. If you want to change them, you can use diag as shown in @markus's answer.
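
To see why reshaping with matrix() puts each pair in the right cell, here is a sketch (an illustration, not part of the original answer) that builds the grid over indices instead of values. expand.grid varies its first argument fastest, and matrix() fills column by column, so row k of the grid lands in cell (i, j). The names grid and m are made up for illustration, and inds != 1 is used as an equivalent shorthand for inds == 0 | inds == 2.

```r
v <- c(0, 1, 1, 0)

# All index pairs (i, j); i varies fastest, matching column-major filling
grid <- expand.grid(i = seq_along(v), j = seq_along(v))

# Sum of the two values in each pair: 0 or 2 means they match, 1 means they differ
inds <- v[grid$i] + v[grid$j]

# Unary + converts the logical vector to 0/1 before reshaping
m <- matrix(+(inds != 1), nrow = length(v))
m
```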

Upvotes: 2

markus

Reputation: 26343

We can use outer:

out <- outer(v, v, `==`)
diag(out) <- 0L # as you don't want to compare each element to itself
out
#     [,1] [,2] [,3] [,4]
#[1,]    0    0    0    1
#[2,]    0    0    1    0
#[3,]    0    1    0    0
#[4,]    1    0    0    0

Upvotes: 5
