user1398057

Reputation: 1159

How can I sample from a matrix given that each column must have a constraint in R?

I currently have a matrix that is 20 by 4, and from it I am trying to sample one value per row. However, the constraint is that each column cannot have more than 5 selected values. Here is my matrix:

  > matrix(1:80, 20, 4, byrow=T)
      [,1] [,2] [,3] [,4]
 [1,]    1    2    3    4
 [2,]    5    6    7    8
 [3,]    9   10   11   12
 [4,]   13   14   15   16
 [5,]   17   18   19   20
 [6,]   21   22   23   24
 [7,]   25   26   27   28
 [8,]   29   30   31   32
 [9,]   33   34   35   36
[10,]   37   38   39   40
[11,]   41   42   43   44
[12,]   45   46   47   48
[13,]   49   50   51   52
[14,]   53   54   55   56
[15,]   57   58   59   60
[16,]   61   62   63   64
[17,]   65   66   67   68
[18,]   69   70   71   72
[19,]   73   74   75   76
[20,]   77   78   79   80

From this, I would like to obtain a matrix where I have randomly sampled one entry per row, but with the constraint that each column must not have more than 5 entries. A sample of what I would like to get is:

 > MM
      [,1] [,2] [,3] [,4]
 [1,]    1    0    0    0
 [2,]    0    0    7    0
 [3,]    0    0   11    0
 [4,]    0    0   15    0
 [5,]    0   18    0    0
 [6,]   21    0    0    0
 [7,]    0    0   27    0
 [8,]    0    0    0   32
 [9,]    0    0   35    0
[10,]    0    0    0   40
[11,]   41    0    0    0
[12,]   45    0    0    0
[13,]   49    0    0    0
[14,]    0   54    0    0
[15,]    0    0    0   60
[16,]    0    0    0   64
[17,]    0   66    0    0
[18,]    0   70    0    0
[19,]    0   74    0    0
[20,]    0    0    0   80

The nonzero values are my chosen values, and I am hoping to get the result in this final format. Does anyone have ideas on how to do this so that the sampling is as independent as possible? Thanks!

Upvotes: 0

Views: 28

Answers (1)

akrun

Reputation: 887981

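Suppose 'm1' is the input matrix. First create a matrix of 0s with the same dimensions as 'm1' (call it 'm2'). With 20 rows, 4 columns, and at most 5 selections per column, every column must receive exactly 5 (5*4 = 20). So replicate each column index 5 times, shuffle the result with sample(), and cbind it with the sequence of rows to form a row/column index matrix. Use that index to replace the values in 'm2' with the corresponding values from 'm1'.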

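 m1 <- matrix(1:80, 20, 4, byrow=TRUE)  # input matrix from the question
 m2 <- matrix(0, ncol=ncol(m1), nrow=nrow(m1))  # all-zero matrix, same shape as m1
 # one column index per row: each column appears exactly 5 times, in random order
 i1 <- cbind(1:nrow(m1), sample(rep(1:ncol(m1), each=5)))
 m2[i1] <- m1[i1]  # copy only the selected entries
 colSums(!!m2)     # number of selected entries per column
 #[1] 5 5 5 5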
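
The code above relies on the number of rows being exactly 5 times the number of columns. As a rough sketch of how the same idea might extend to a matrix with fewer rows, one option is to treat the replicated column indices as a pool of "slots" and draw one slot per row without replacement. The helper name `sample_capped` and its `cap` argument are hypothetical, not part of the original answer, and the sketch assumes the input has no zero entries (as in the question's matrix), so that a zero always means "not selected".

```r
# Hypothetical helper (not from the original answer): pick one entry per row
# of `m`, with at most `cap` picks in any column.
sample_capped <- function(m, cap = 5) {
  # If there are more rows than slots, no valid assignment exists.
  stopifnot(nrow(m) <= ncol(m) * cap)
  # Pool of column "slots": each column index appears `cap` times.
  slots <- rep(seq_len(ncol(m)), each = cap)
  # Draw nrow(m) slots without replacement, giving one column per row.
  cols <- slots[sample.int(length(slots), nrow(m))]
  idx <- cbind(seq_len(nrow(m)), cols)
  out <- matrix(0, nrow = nrow(m), ncol = ncol(m))
  out[idx] <- m[idx]
  out
}

set.seed(1)  # only to make the example reproducible
m1 <- matrix(1:80, 20, 4, byrow = TRUE)
res <- sample_capped(m1[1:12, ], cap = 5)  # 12 rows, 4 columns
rowSums(res != 0)  # one selected entry per row
colSums(res != 0)  # each column count is at most 5
```

When the number of rows equals `ncol(m) * cap`, this reduces to the shuffle used above. With fewer rows, drawing slots without replacement tends to favor balanced column counts rather than weighting every valid assignment equally, so a different scheme may be preferable if that matters.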

Upvotes: 1
