dineshdileep

Reputation: 805

Randomly sample entries of a matrix and return their (row, column) indices in R?

I have a matrix with M rows and N columns. I need to randomly sample locations in this matrix and return their row and column indices.

My approach: say I want to sample 30 percent of the entries in the matrix. I iterate over the whole matrix and, at each entry, toss a biased coin that comes up heads with probability 0.3; if it comes up heads, I select that location. Since my data is large, this selects approximately 30% of the entries. However, it is really slow. Is there a way to speed this up, or a better way to do it?

Upvotes: 1

Views: 2754

Answers (3)

nicola

Reputation: 24480

If m is your matrix, just try:

arrayInd(sample(length(m),0.3*length(m)),dim(m))

An example:

set.seed(1)
m<-matrix(ncol=6,nrow=6)
arrayInd(sample(length(m),0.3*length(m)),dim(m))      
#      [,1] [,2]
# [1,]    4    2
# [2,]    2    3
# [3,]    2    4
# [4,]    6    5
# [5,]    1    2
# [6,]    4    5
# [7,]    5    5
# [8,]    4    6
# [9,]    6    3
#[10,]    2    1
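
A minimal sketch of how the returned index matrix might be used, assuming a hypothetical 6 x 6 matrix `m` with known values; the names `k`, `idx` and `vals` are illustrative only. The `floor()` call just makes explicit that a fractional sample size is truncated, which is why 0.3 * 36 = 10.8 yields 10 rows above.

```r
# Hypothetical 6 x 6 matrix with known values (illustrative only).
m <- matrix(seq_len(36), nrow = 6, ncol = 6)

# floor() makes the truncation of a fractional sample size explicit.
k <- floor(0.3 * length(m))

# Sample k distinct linear positions, then convert them to (row, col) pairs.
idx <- arrayInd(sample(length(m), k), dim(m))

# A two-column integer matrix selects one element per row of idx.
vals <- m[idx]
```

Because `sample()` draws without replacement by default, no location is selected twice.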

Upvotes: 4

PascalVKooten

Reputation: 21433

My new favorite option:

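indexSampler <- function(m, p) {
    # One TRUE/FALSE draw per entry, TRUE with probability p.
    # nrow is named explicitly so that the mask has the same shape as m,
    # including when m is not square.
    matrix(sample(c(TRUE, FALSE), length(m), prob = c(p, 1 - p), replace = TRUE),
           nrow = nrow(m))
}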

You won't get indices, but you'll get a matrix full of TRUE/FALSE that can be used to index.

It is ridiculously fast (a factor of 1000 for a matrix of 200x200, and also significantly faster for small matrices).
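
If the (row, column) pairs themselves are needed, a logical mask of this kind can be converted with `which(..., arr.ind = TRUE)`. A minimal sketch, assuming a hypothetical 4 x 7 matrix `m` and building the mask with `runif()` rather than the function above; the names `mask` and `idx` are illustrative only:

```r
# Hypothetical 4 x 7 matrix; only its dimensions matter here.
m <- matrix(0, nrow = 4, ncol = 7)
p <- 0.3

# One independent draw per entry, TRUE with probability p,
# reshaped to the shape of m.
mask <- matrix(runif(length(m)) < p, nrow = nrow(m), ncol = ncol(m))

# which(..., arr.ind = TRUE) lists the (row, col) position of every TRUE.
idx <- which(mask, arr.ind = TRUE)

# The number of selected entries is random; its mean is p * length(m).
nrow(idx)
```

Unlike sampling a fixed number of positions, this selects each entry independently, so the selected fraction is only approximately `p`.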

Upvotes: 1

zx8754

Reputation: 56004

See this example:

m=2
n=5
SampleSize=0.3

#dummy data
x <- matrix(runif(m*n),nrow=n)

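#sample: mark a random 30% of the positions with a sentinel value
#(-9 cannot collide with the data, since runif() returns values in (0, 1))
set.seed(123)
temp <- x
temp[ sample(1:length(temp),round(length(temp)*SampleSize))] <- -9

#index: logical matrix, TRUE at the sampled positions
ix <- temp==-9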

ix
#        [,1]  [,2]
# [1,] FALSE FALSE
# [2,] FALSE FALSE
# [3,]  TRUE  TRUE
# [4,]  TRUE FALSE
# [5,] FALSE FALSE
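
The sentinel value only works when it cannot occur in the data. A possible variant, sketched here under the assumption of the same 5 x 2 shape, builds the logical matrix directly; the names `n_row`, `n_col` and `frac` are illustrative only:

```r
n_row <- 5
n_col <- 2
frac <- 0.3

# Start with an all-FALSE mask of the same shape as the data.
ix <- matrix(FALSE, nrow = n_row, ncol = n_col)

# Flip exactly round(frac * size) distinct positions to TRUE.
ix[sample(length(ix), round(length(ix) * frac))] <- TRUE

# (row, col) pairs of the sampled entries.
which(ix, arr.ind = TRUE)
```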

Upvotes: 1
