Reputation: 432
I'm trying to use lapply on each matrix in a list. I want to apply the sample function to each one using lapply.
Let's take an example. I generated the probabilities that will be passed to the sample function. (Sorry that the code is not optimized.)
set.seed(1001)
given<-replicate(3,list(matrix(unlist(replicate(5,sample(c(0.2,0.3,0.4,0.1),4,replace=FALSE),simplify=FALSE)),ncol=4)))
given
[[1]]
[,1] [,2] [,3] [,4]
[1,] 0.1 0.4 0.2 0.4
[2,] 0.3 0.2 0.1 0.2
[3,] 0.2 0.1 0.1 0.3
[4,] 0.4 0.3 0.3 0.1
[5,] 0.3 0.4 0.2 0.4
[[2]]
[,1] [,2] [,3] [,4]
[1,] 0.4 0.4 0.3 0.4
[2,] 0.3 0.1 0.4 0.2
[3,] 0.1 0.2 0.1 0.4
[4,] 0.2 0.1 0.3 0.3
[5,] 0.3 0.2 0.2 0.1
[[3]]
[,1] [,2] [,3] [,4]
[1,] 0.3 0.2 0.2 0.1
[2,] 0.2 0.3 0.3 0.3
[3,] 0.1 0.4 0.2 0.2
[4,] 0.4 0.4 0.3 0.4
[5,] 0.1 0.1 0.4 0.1
So this list has three components, each of which is a 5x4 matrix. Each row of each matrix (15 rows in total) is a vector of sampling probabilities. I want to generate 10 samples per row using that row's probabilities. For simplicity, I'll sample the values 1 to 4 with the given probabilities.
With the help of this post (How to generate random data set with predicted probability?), I learned how to apply the sample function to each row of a single matrix. If given were a single matrix, I would run this:
lapply(1:nrow(given), function(x) sample(1:4, 10, replace = TRUE, prob = given[x, ]))
But, as you can see, given is a list of 3 matrices. I made several attempts, such as prob=given$x and prob=given[[x,]], but they all failed. Is there a way to do this?
*additional question
To Ronak Shah
It turned out perfectly right. Thanks!
However, sorry for not asking the whole question at once. In fact, there is some missing data in the probability set. To show this, I'll set one row of given to missing values.
given[[2]][1,]<-NA
given
[[1]]
[,1] [,2] [,3] [,4]
[1,] 0.1 0.4 0.2 0.4
[2,] 0.3 0.2 0.1 0.2
[3,] 0.2 0.1 0.1 0.3
[4,] 0.4 0.3 0.3 0.1
[5,] 0.3 0.4 0.2 0.4
[[2]]
[,1] [,2] [,3] [,4]
[1,] NA NA NA NA
[2,] 0.3 0.1 0.4 0.2
[3,] 0.1 0.2 0.1 0.4
[4,] 0.2 0.1 0.3 0.3
[5,] 0.3 0.2 0.2 0.1
[[3]]
[,1] [,2] [,3] [,4]
[1,] 0.3 0.2 0.2 0.1
[2,] 0.2 0.3 0.3 0.3
[3,] 0.1 0.4 0.2 0.2
[4,] 0.4 0.4 0.3 0.4
[5,] 0.1 0.1 0.4 0.1
After reading your answer, I modified the code in it, but the result was quite different from what I expected.
lapply(given, function(x) t(sapply(seq_len(nrow(x)), function(y)
ifelse(is.na(x[y,]),NA,sample(1:4, 10, replace = TRUE, prob = x[y, ])))))
[[1]]
[,1] [,2] [,3] [,4]
[1,] 4 4 4 2
[2,] 2 3 2 2
[3,] 4 4 1 1
[4,] 1 3 1 1
[5,] 3 3 1 1
[[2]]
[,1] [,2] [,3] [,4]
[1,] NA NA NA NA
[2,] 3 4 3 2
[3,] 4 2 2 2
[4,] 4 2 1 1
[5,] 1 2 4 1
[[3]]
[,1] [,2] [,3] [,4]
[1,] 1 1 2 2
[2,] 3 4 3 4
[3,] 2 3 2 4
[4,] 2 4 4 2
[5,] 2 3 3 3
As you can see, the NA row came out right, but only 4 samples were generated per row instead of 10. Could you please show me how to solve this problem?
Upvotes: 0
Views: 450
Reputation: 388907
Without over-complicating it too much and continuing from your attempt, we can use sapply inside lapply. lapply loops over each matrix in the list, whereas sapply loops over every row of that matrix.
lapply(given, function(x) t(sapply(seq_len(nrow(x)), function(y)
sample(1:4, 10, replace = TRUE, prob = x[y, ]))))
#[[1]]
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#[1,] 2 3 4 4 3 4 4 4 2 1
#[2,] 1 1 1 2 4 1 2 2 2 3
#[3,] 1 4 4 1 4 1 1 2 2 4
#[4,] 1 1 3 2 3 2 3 1 1 3
#[5,] 4 2 3 1 2 2 1 4 1 4
#[[2]]
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#[1,] 1 3 2 3 2 1 1 1 2 1
#[2,] 3 1 1 1 3 3 2 3 1 4
#[3,] 4 3 4 2 4 4 4 4 4 4
#[4,] 3 3 4 4 3 4 4 2 3 4
#[5,] 1 1 2 2 4 1 1 2 1 4
#[[3]]
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#[1,] 3 1 1 2 1 3 3 1 2 1
#[2,] 4 4 3 1 3 3 3 3 2 4
#[3,] 1 1 2 2 2 3 4 4 2 4
#[4,] 2 1 4 4 1 3 3 4 4 1
#[5,] 3 3 3 3 3 3 1 2 3 3
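If it helps to see the nesting in isolation, here is a small self-contained sketch on toy data. The names probs and draws are made up for illustration and are not part of the question. It only checks that each matrix of probabilities turns into a matrix with one row of 10 draws per probability row.

```r
set.seed(1)  # arbitrary seed, only so this sketch is reproducible

# Hypothetical toy input: a list of two 3 x 4 probability matrices
# (the 4 values are recycled, so every row holds the same probabilities)
probs <- list(
  matrix(c(0.1, 0.2, 0.3, 0.4), nrow = 3, ncol = 4, byrow = TRUE),
  matrix(c(0.4, 0.3, 0.2, 0.1), nrow = 3, ncol = 4, byrow = TRUE)
)

# Outer lapply: one matrix at a time; inner sapply: one row at a time
draws <- lapply(probs, function(x) {
  t(sapply(seq_len(nrow(x)), function(y) {
    sample(1:4, 10, replace = TRUE, prob = x[y, ])
  }))
})

# sapply binds the 10 draws for each row as a column (a 10 x 3 matrix),
# so t() is needed to get one row of draws per row of probabilities
sapply(draws, dim)  # each column is the dim of one result: 3 and 10
```

Dropping t() would still work, but the draws for each probability row would then run down the columns instead of across the rows.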
To handle NA values we can do
lapply(given, function(x) t(sapply(seq_len(nrow(x)), function(y)
if (anyNA(x[y,])) rep(NA, 10) else
sample(1:4, 10, replace = TRUE, prob = x[y, ]))))
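As for why the ifelse version in the question returned only 4 values per row: ifelse is vectorised over its test and returns a result of the same length as the test. is.na(x[y, ]) has length 4, so only the first 4 sampled values are kept. A plain if/else with anyNA checks the row once and returns the whole vector. A minimal sketch of the difference, assuming a made-up probability row p:

```r
set.seed(1)  # arbitrary seed for this sketch

p <- c(0.2, 0.3, 0.4, 0.1)  # hypothetical probability row without NA

# ifelse() returns one value per element of its test, so 4 values here
short <- ifelse(is.na(p), NA, sample(1:4, 10, replace = TRUE, prob = p))
length(short)  # 4

# if/else evaluates a single condition and returns the full vector
full <- if (anyNA(p)) rep(NA, 10) else sample(1:4, 10, replace = TRUE, prob = p)
length(full)  # 10

# A row containing NA takes the other branch and still has length 10
p_na <- rep(NA_real_, 4)
missing_row <- if (anyNA(p_na)) rep(NA, 10) else
  sample(1:4, 10, replace = TRUE, prob = p_na)
length(missing_row)  # 10
```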
Upvotes: 2