ESKim

Reputation: 432

Using lapply on each matrix in a list

I'm trying to use lapply on each matrix in a list.

I want to apply the sample function to each row of each matrix.

Here is an example. I generated the probabilities that will be passed to sample. (Sorry that the code isn't optimized.)

set.seed(1001)
given<-replicate(3,list(matrix(unlist(replicate(5,sample(c(0.2,0.3,0.4,0.1),4,replace=FALSE),simplify=FALSE)),ncol=4)))
given   


[[1]]
     [,1] [,2] [,3] [,4]
[1,]  0.1  0.4  0.2  0.4
[2,]  0.3  0.2  0.1  0.2
[3,]  0.2  0.1  0.1  0.3
[4,]  0.4  0.3  0.3  0.1
[5,]  0.3  0.4  0.2  0.4

[[2]]
     [,1] [,2] [,3] [,4]
[1,]  0.4  0.4  0.3  0.4
[2,]  0.3  0.1  0.4  0.2
[3,]  0.1  0.2  0.1  0.4
[4,]  0.2  0.1  0.3  0.3
[5,]  0.3  0.2  0.2  0.1

[[3]]
     [,1] [,2] [,3] [,4]
[1,]  0.3  0.2  0.2  0.1
[2,]  0.2  0.3  0.3  0.3
[3,]  0.1  0.4  0.2  0.2
[4,]  0.4  0.4  0.3  0.4
[5,]  0.1  0.1  0.4  0.1

So this list has three components, each of which is a 5 x 4 matrix. Each row of each matrix (15 rows in total) is a set of probabilities. I want to generate 10 samples per row using those probabilities. For simplicity, I'll sample the values 1 to 4.

With the help of this question (How to generate random data set with predicted probability?), I learned how to apply the sample function to each row of a single matrix. If given were a single matrix, I would run this:

lapply(1:nrow(given), function(x) sample(1:4, 10, replace = TRUE, prob = given[x, ]))

But, as you can see, given is a list of 3 matrices. I tried several variations, such as prob=given$x and prob=given[[x,]], but they all failed. Is there a way to do this?

*Additional question

To Ronak Shah:

It worked perfectly. Thanks!

However, sorry for not asking the whole question at once. In fact, some of the probability data is missing.

I'll set one row of given to missing values.

given[[2]][1,]<-NA
given

[[1]]
     [,1] [,2] [,3] [,4]
[1,]  0.1  0.4  0.2  0.4
[2,]  0.3  0.2  0.1  0.2
[3,]  0.2  0.1  0.1  0.3
[4,]  0.4  0.3  0.3  0.1
[5,]  0.3  0.4  0.2  0.4

[[2]]
     [,1] [,2] [,3] [,4]
[1,]   NA   NA   NA   NA
[2,]  0.3  0.1  0.4  0.2
[3,]  0.1  0.2  0.1  0.4
[4,]  0.2  0.1  0.3  0.3
[5,]  0.3  0.2  0.2  0.1

[[3]]
     [,1] [,2] [,3] [,4]
[1,]  0.3  0.2  0.2  0.1
[2,]  0.2  0.3  0.3  0.3
[3,]  0.1  0.4  0.2  0.2
[4,]  0.4  0.4  0.3  0.4
[5,]  0.1  0.1  0.4  0.1

After reading your answer, I modified your code to handle the missing row, but the result was not what I expected.

lapply(given, function(x) t(sapply(seq_len(nrow(x)), function(y)
  ifelse(is.na(x[y,]), NA, sample(1:4, 10, replace = TRUE, prob = x[y, ])))))

[[1]]
     [,1] [,2] [,3] [,4]
[1,]    4    4    4    2
[2,]    2    3    2    2
[3,]    4    4    1    1
[4,]    1    3    1    1
[5,]    3    3    1    1

[[2]]
     [,1] [,2] [,3] [,4]
[1,]   NA   NA   NA   NA
[2,]    3    4    3    2
[3,]    4    2    2    2
[4,]    4    2    1    1
[5,]    1    2    4    1

[[3]]
     [,1] [,2] [,3] [,4]
[1,]    1    1    2    2
[2,]    3    4    3    4
[3,]    2    3    2    4
[4,]    2    4    4    2
[5,]    2    3    3    3

As you can see, the NA row came out right, but only 4 samples were generated per row instead of 10. Could you please show me how to solve this?

Upvotes: 0

Views: 450

Answers (1)

Ronak Shah

Reputation: 388907

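Without over-complicating it, and continuing from your attempt, we can use sapply inside lapply. lapply loops over each matrix in the list, while sapply loops over every row of that matrix. sapply returns one column per row, so t() transposes the result to give one row of 10 samples per row of probabilities.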

lapply(given, function(x) t(sapply(seq_len(nrow(x)), function(y) 
             sample(1:4, 10, replace = TRUE, prob = x[y, ]))))

#[[1]]
#     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#[1,]    2    3    4    4    3    4    4    4    2     1
#[2,]    1    1    1    2    4    1    2    2    2     3
#[3,]    1    4    4    1    4    1    1    2    2     4
#[4,]    1    1    3    2    3    2    3    1    1     3
#[5,]    4    2    3    1    2    2    1    4    1     4

#[[2]]
#     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#[1,]    1    3    2    3    2    1    1    1    2     1
#[2,]    3    1    1    1    3    3    2    3    1     4
#[3,]    4    3    4    2    4    4    4    4    4     4
#[4,]    3    3    4    4    3    4    4    2    3     4
#[5,]    1    1    2    2    4    1    1    2    1     4

#[[3]]
#     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#[1,]    3    1    1    2    1    3    3    1    2     1
#[2,]    4    4    3    1    3    3    3    3    2     4
#[3,]    1    1    2    2    2    3    4    4    2     4
#[4,]    2    1    4    4    1    3    3    4    4     1
#[5,]    3    3    3    3    3    3    1    2    3     3
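
As an alternative formulation (not part of the original attempt), the same idea can be sketched with apply over rows instead of indexing with seq_len. The names sample_rows and n are hypothetical, and the weight matrices below are generated with runif purely for illustration. Note that sample treats prob as weights that need not sum to one, which matters here because the rows of given do not all sum to 1 (the first row of the first matrix sums to 1.1).

```r
# Illustrative data: a list of three 5 x 4 matrices of positive weights
# (an assumption for this sketch, not the `given` from the question).
set.seed(1001)
given <- replicate(3, matrix(runif(20), nrow = 5, ncol = 4), simplify = FALSE)

# Hypothetical helper: for every row of m, draw n values from 1:ncol(m),
# using that row as the sampling weights.
sample_rows <- function(m, n = 10) {
  # apply() over rows returns one column per row, so transpose to nrow(m) x n
  t(apply(m, 1, function(p) sample(seq_along(p), n, replace = TRUE, prob = p)))
}

result <- lapply(given, sample_rows)
dim(result[[1]])  # 5 10
```

Each element of result is a 5 x 10 matrix, one row of samples per row of weights, which matches the layout of the output above.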

To handle NA values, we can check each row with anyNA and return 10 NAs for a row that has missing probabilities:

lapply(given, function(x) t(sapply(seq_len(nrow(x)), function(y) 
      if (anyNA(x[y,])) rep(NA, 10) else 
         sample(1:4, 10, replace = TRUE, prob = x[y, ]))))
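
The ifelse attempt in the question returned only 4 values per row because ifelse is vectorized over its test and returns a result of the same length as the test. is.na(x[y, ]) has length 4, so only the first 4 of the 10 sampled values are kept. A minimal sketch of the difference, using an illustrative weight vector p (the variable names here are assumptions for the example):

```r
p <- c(0.1, 0.4, 0.2, 0.4)   # one row of weights, length 4
set.seed(1)

# ifelse() returns a vector as long as its test, here length(is.na(p)) == 4,
# so the 10 sampled values are cut down to the first 4
with_ifelse <- ifelse(is.na(p), NA, sample(1:4, 10, replace = TRUE, prob = p))
length(with_ifelse)  # 4

# if/else evaluates a single condition and returns the whole branch value
with_if <- if (anyNA(p)) rep(NA, 10) else sample(1:4, 10, replace = TRUE, prob = p)
length(with_if)      # 10

# a row of missing weights yields 10 NAs without ever calling sample()
p_na <- rep(NA_real_, 4)
na_row <- if (anyNA(p_na)) rep(NA, 10) else sample(1:4, 10, replace = TRUE, prob = p_na)
```

This is why the version above uses if with anyNA, which gives a single TRUE or FALSE for the whole row.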

Upvotes: 2
