Reputation: 69

Create a binary matrix under certain conditions

I am trying to create a function that given m and p returns a matrix with m rows and mxp columns. The matrix should have 0's except for p positions, starting at p (number of row).

For example, given m=4 and p=2, the matrix should look like:

1    1    0    0    0    0    0    0
0    0    1    1    0    0    0    0
0    0    0    0    1    1    0    0
0    0    0    0    0    0    1    1

I want to work with big matrices. I know how to do this with loops in other programming languages such as python, but I am sure that it should be an easier and more elegant way to do this in R. I have been playing for a while with diag() without finding the solution.

Upvotes: 2

Answers (6)

lmo

Reputation: 38500

This method uses matrix subsetting to fill in the 1s.

myMatFunc <- function(m, p) {
  # initialize matrix of correct size, filled with 0s
  myMat <- matrix(0L, m, m * p)
  #fill in 1s using matrix subsetting
  myMat[cbind(rep(seq_len(m), each=p), seq_len(m * p))] <- 1L

  myMat
}

then,

myMatFunc(4, 2)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,]    1    1    0    0    0    0    0    0
[2,]    0    0    1    1    0    0    0    0
[3,]    0    0    0    0    1    1    0    0
[4,]    0    0    0    0    0    0    1    1

Thanks to the comments from @joseph-wood, @jogo, and @A5C1D2H2I1M1N2O1R2T1 below, I improved the efficiency removing a call to nrow and a call to ncol, cut the size of the matrix in half by converting to integers, and fixed an initial testing typo.

Upvotes: 4

M--

Reputation: 28826

Here is another way to do it but I would choose @989 answer over mine;

 cadv.func = function(m,p)
{

  cmat <- matrix(data=NA,nrow=m,ncol=m*p)
  cmat[is.na(cmat)] <- 0

  for (i in 1:m){
    for (j in 1:p){

    cmat[i,j+p*(i-1)] = 1

  } 
  }

  return(cmat)
}

cadv.func(4,2)


 #       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
 # [1,]    1    1    0    0    0    0    0    0
 # [2,]    0    0    1    1    0    0    0    0
 # [3,]    0    0    0    0    1    1    0    0
 # [4,]    0    0    0    0    0    0    1    1

Upvotes: 1

Joseph Wood

Reputation: 7597

Here is a base R solution that is pretty fast:

Joseph <- function(m, p) {
  mat <- matrix(0L, nrow = m, ncol = m*p)
  for (i in 1:m) {mat[i, p*(i-1L) + 1:p] <- 1L}
  mat
}

Here are some equality comparisons:

fun989 <- function(m, p){
  a <- diag(m)
  a[,rep(seq_len(m), each=p)]
}

IMO <- function(m, p) {
  myMat <- matrix(0L, m, m*p)
  myMat[cbind(rep(seq_len(nrow(myMat)), each=p), seq_len(ncol(myMat)))] <- 1
  myMat
}

JOGO <- function(m, p) {matrix(rep(diag(m), each = p), nrow = m, byrow = TRUE)}
APOM <- function(m, p) {t(apply(diag(m), 2, rep, each = p))}

library(compiler)
enableJIT(3)  ## compiling each function
all.equal(Joseph(100, 50), fun989(100, 50))
[1] TRUE
all.equal(Joseph(100, 50), APOM(100, 50))
[1] TRUE
all.equal(Joseph(100, 50), JOGO(100, 50))
[1] TRUE
all.equal(Joseph(100, 50), IMO(100, 50))
[1] TRUE
enableJIT(0)  ## return to standard setting

Here are the benchmarks:

library(microbenchmark)

microbenchmark(Joseph(100, 50), JOGO(100, 50), fun989(100, 50), APOM(100, 50), IMO(100, 50), unit = "relative")
Unit: relative
           expr       min        lq     mean    median        uq      max neval cld
Joseph(100, 50)  1.000000  1.000000 1.000000  1.000000  1.000000 1.000000   100  a 
  JOGO(100, 50) 33.388929 20.892988 6.593804 22.365625 19.161056 1.167957   100   b
fun989(100, 50)  7.192071  4.577225 2.044973  4.432824  4.129563 1.029050   100  a 
  APOM(100, 50) 40.244128 28.176729 8.805715 27.785985 23.966477 1.209582   100   b
   IMO(100, 50)  6.119685  3.898451 2.712222  6.192030  6.033916 1.044422   100  a

Upvotes: 3

Aurèle

Reputation: 12819

apply()ing the rep() function to each row (or column, it's the same) of the diagonal matrix:

t(apply(diag(m), 2, rep, each = p))

#      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
# [1,]    1    1    0    0    0    0    0    0
# [2,]    0    0    1    1    0    0    0    0
# [3,]    0    0    0    0    1    1    0    0
# [4,]    0    0    0    0    0    0    1    1

Upvotes: 5

989

Reputation: 12937

How about this:

f <- function(m, p){
     a <- diag(m)
     a[,rep(seq_len(m), each=p)]
}

> f(m = 4, p = 2)

#     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
#[1,]    1    1    0    0    0    0    0    0
#[2,]    0    0    1    1    0    0    0    0
#[3,]    0    0    0    0    1    1    0    0
#[4,]    0    0    0    0    0    0    1    1

> f(m = 3, p = 4)

#     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
#[1,]    1    1    1    1    0    0    0    0    0     0     0     0
#[2,]    0    0    0    0    1    1    1    1    0     0     0     0
#[3,]    0    0    0    0    0    0    0    0    1     1     1     1

The idea is to first create a diagonal matrix of size m (which we name a) and then repeat each column of that matrix p times (so m*p matrix).

Upvotes: 4

jogo

Reputation: 12559

This solution for p=2 uses the change of the number of rows:

m <- 4
d <- diag(m)
matrix(rbind(d,d), m)
#      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
# [1,]    1    1    0    0    0    0    0    0
# [2,]    0    0    1    1    0    0    0    0
# [3,]    0    0    0    0    1    1    0    0
# [4,]    0    0    0    0    0    0    1    1

For other values of p (from the comment of A5C1D2H2I1M1N2O1R2T1):

p <- 3; m <- 4
matrix(rep(diag(m), each = p), nrow = m, byrow = TRUE)

Upvotes: 5

Create a binary matrix under certain conditions

Answers (6)

Related Questions