Reputation: 1159
I currently have a matrix that is 20 by 4, and from it I am trying to sample one value per row. However, the constraint is that each column cannot have more than 5 selected values. Here is a sample of my matrix:
> matrix(1:80, 20, 4, byrow=T)
      [,1] [,2] [,3] [,4]
 [1,]    1    2    3    4
 [2,]    5    6    7    8
 [3,]    9   10   11   12
 [4,]   13   14   15   16
 [5,]   17   18   19   20
 [6,]   21   22   23   24
 [7,]   25   26   27   28
 [8,]   29   30   31   32
 [9,]   33   34   35   36
[10,]   37   38   39   40
[11,]   41   42   43   44
[12,]   45   46   47   48
[13,]   49   50   51   52
[14,]   53   54   55   56
[15,]   57   58   59   60
[16,]   61   62   63   64
[17,]   65   66   67   68
[18,]   69   70   71   72
[19,]   73   74   75   76
[20,]   77   78   79   80
From this, I would like to obtain a matrix where I have randomly sampled one entry per row, but with the constraint that each column must not have more than 5 entries. A sample of what I would like to get is:
> MM
      [,1] [,2] [,3] [,4]
 [1,]    1    0    0    0
 [2,]    0    0    7    0
 [3,]    0    0   11    0
 [4,]    0    0   15    0
 [5,]    0   18    0    0
 [6,]   21    0    0    0
 [7,]    0    0   27    0
 [8,]    0    0    0   32
 [9,]    0    0   35    0
[10,]    0    0    0   40
[11,]   41    0    0    0
[12,]   45    0    0    0
[13,]   49    0    0    0
[14,]    0   54    0    0
[15,]    0    0    0   60
[16,]    0    0    0   64
[17,]    0   66    0    0
[18,]    0   70    0    0
[19,]    0   74    0    0
[20,]    0    0    0   80
Here the nonzero values are my chosen values, and I am hoping to get the result in this final format. Does anyone have ideas on how to do this so that I am sampling as independently as possible? Thanks!
Upvotes: 0
Views: 28
Reputation: 887981
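Suppose 'm1' is the input matrix. We create a matrix of 0's with the same dimensions as 'm1' ('m2'). We then replicate each column index 5 times (5 * 4 = 20 values, one per row), shuffle that vector with sample, and cbind it with the sequence of rows to create the row/column index. That index is used to replace the values in 'm2' with those from 'm1'. Because there are 20 rows and 4 columns, a cap of 5 per column means every column ends up with exactly 5 selections.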
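# zero matrix with the same dimensions as m1
m2 <- matrix(0, ncol = ncol(m1), nrow = nrow(m1))
# row index paired with a shuffled column index (each column appears 5 times)
i1 <- cbind(1:nrow(m1), sample(rep(1:ncol(m1), each = 5)))
m2[i1] <- m1[i1]
# number of selected entries per column
colSums(!!m2)
#[1] 5 5 5 5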
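If the number of rows is smaller than 5 times the number of columns, the shuffled vector above would be longer than the number of rows. A minimal sketch of one way to handle that case is below. It assumes a hypothetical helper name, sample_capped, and a cap argument, neither of which is part of the original answer. It draws only as many column labels as there are rows from the same replicated pool. This keeps every column at or below the cap, but it is not claimed to be uniform over all valid assignments when the cap is not tight.

```r
# Hypothetical helper (not from the original answer): pick one entry per row
# with at most `cap` picks in any column.
sample_capped <- function(m, cap = 5) {
  # a valid assignment only exists if the columns can absorb every row
  stopifnot(nrow(m) <= cap * ncol(m))
  # pool of column labels, each appearing `cap` times
  pool <- rep(seq_len(ncol(m)), each = cap)
  # draw one label per row without replacement; sample.int on the length
  # avoids the surprise of sample() on a length-one vector
  cols <- pool[sample.int(length(pool), nrow(m))]
  idx <- cbind(seq_len(nrow(m)), cols)
  out <- matrix(0, nrow = nrow(m), ncol = ncol(m))
  out[idx] <- m[idx]
  out
}

m1 <- matrix(1:80, 20, 4, byrow = TRUE)
m2 <- sample_capped(m1, cap = 5)
colSums(m2 != 0)  # per-column counts, none above the cap
```

With the 20 by 4 example and cap = 5 this reduces to the approach above, since all 20 labels in the pool are used.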
Upvotes: 1