Mark Miller

Reputation: 13103

randomly replace elements in a matrix

I would like to randomly replace elements in a matrix with some specified value, here -99. I tried the first method below and it did not work. Then I tried a different approach, also below, and it did work.

Why does the first method not work? What am I doing incorrectly? Thank you for any advice.

I suspect the second method is better because, apart from working, it lets me specify the percentage of elements to replace. The first method does not, since it can draw the same i,j pair more than once.

Here is the first method, the one that does not work:

# This does not work

set.seed(1234)

ncols    <-  10
nrows    <-   5
NA_value <- -99

my.fake.data <- round(rnorm(ncols*nrows, 20, 5))

my.fake.grid <- matrix(my.fake.data, nrow=nrows, ncol=ncols, byrow=TRUE)
my.fake.grid

random.i <- sample(ncols, round(0.40*nrows*ncols), replace = TRUE)
random.j <- sample(nrows, round(0.40*nrows*ncols), replace = TRUE)

my.fake.grid[random.j, random.i] <- NA_value
my.fake.grid

Here is the second method, the one that does work:

# This works

set.seed(1234)

ncols    <-  10
nrows    <-   5
NA_value <- -99

my.fake.data <- round(rnorm(ncols*nrows, 20, 5))

my.fake.grid <- matrix(my.fake.data, nrow=nrows, ncol=ncols, byrow=TRUE)
my.fake.grid

my.fake.data2 <- c(my.fake.grid)

random.x <- sample(length(my.fake.data2), round(0.40*length(my.fake.data2)), replace = FALSE)

my.fake.data2[random.x] <- NA_value

my.fake.grid2 <- matrix(my.fake.data2, nrow=nrows, ncol=ncols, byrow=FALSE)
my.fake.grid2

Upvotes: 2

Views: 896

Answers (1)

David Arenburg

Reputation: 92292

You could try:

library(data.table) # For a faster cross join; alternatively, could use expand.grid
temp <- as.matrix(CJ(seq_len(nrows), seq_len(ncols))) # Create all possible row/column index combinations
indx <- temp[sample(nrow(temp), round(0.4 * nrow(temp))), ] # Sample 40% of them
my.fake.grid[indx] <- NA_value # Replace with -99
sum(my.fake.grid == -99)/(ncols * nrows) # Validating percentage
##[1] 0.4
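
As for why the first method misbehaves: my reading is that `my.fake.grid[random.j, random.i]` does not pick (row, column) pairs. Two index vectors select every combination of the listed rows and columns, so with 20 draws each from 5 rows and 10 columns, most of the matrix is likely to be overwritten. Pairwise selection needs a single two-column matrix of indices, which is what `indx` is above. A minimal sketch on a small zero matrix (the names `demo`, `block`, `paired`, and `demo.grid` are only for illustration, not from the question):

```r
# Illustration only: block indexing vs. pairwise (matrix) indexing
demo <- matrix(0, nrow = 5, ncol = 10)
rows <- c(1, 2, 3)
cols <- c(4, 5, 6)

block <- demo
block[rows, cols] <- -99          # all combinations of rows 1:3 and columns 4:6
sum(block == -99)                 # 9 cells replaced

paired <- demo
paired[cbind(rows, cols)] <- -99  # only the cells (1,4), (2,5), (3,6)
sum(paired == -99)                # 3 cells replaced

# A base R route to an exact percentage: sample linear (cell) indices
# without replacement and assign directly into the matrix
set.seed(1234)
demo.grid <- matrix(round(rnorm(50, 20, 5)), nrow = 5, ncol = 10, byrow = TRUE)
n.replace <- round(0.40 * length(demo.grid))
demo.grid[sample(length(demo.grid), n.replace)] <- -99
mean(demo.grid == -99)            # proportion of cells replaced
```

Sampling cell indices without replacement is essentially the second method from the question, just without flattening and rebuilding the matrix, since a matrix in R can be indexed by a single vector of cell positions.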

Upvotes: 3
