Reputation: 23898
This is a follow-up to this question. I want to draw a random sample from each row of a data.frame, independently of the other rows. The data.frame may contain NAs, as in the data.frame df given below.
set.seed(12345)
df1 <- c(rnorm(n=4, mean=0, sd=1), NA)
df2 <- rnorm(n=5, mean=10, sd=1)
df <- rbind(df1, df2)
t(apply(df, 1, sample, replace=TRUE))
         [,1]     [,2]       [,3]     [,4]    [,5]
df1 0.5855288       NA -0.1093033 0.709466      NA
df2 9.7238159 9.723816  8.1820440 9.723816 10.6301
From the first row I want to select four observations (the non-NA columns) with replacement, and from the second row I want to select five observations (the non-NA columns) with replacement, independently of the first selection. But my code selects five observations with replacement from the first row, so NA can be drawn, and five observations with replacement from the second row.
Upvotes: 2
Views: 548
Reputation: 886938
I guess you want to sample only from the non-NA values. In that case, !is.na can be used to drop the NA values, and then we sample from the remaining values. The output will be a list ('lst') because the number of elements differs (4 and 5) between the rows after sampling.
lst <- apply(df, 1, function(x) sample(x[!is.na(x)], replace=TRUE))
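As a quick check of the line above, the following sketch rebuilds the df from the question and confirms that each row is resampled only from its non-NA values. The name n_valid is introduced here purely for illustration.

```r
set.seed(12345)
df1 <- c(rnorm(n = 4, mean = 0, sd = 1), NA)
df2 <- rnorm(n = 5, mean = 10, sd = 1)
df <- rbind(df1, df2)

# Sample each row with replacement, using only its non-NA values
lst <- apply(df, 1, function(x) sample(x[!is.na(x)], replace = TRUE))

# Number of non-NA values per row (illustrative name, not from the answer)
n_valid <- rowSums(!is.na(df))

lengths(lst)        # one length per row; should equal n_valid
anyNA(unlist(lst))  # FALSE when no NA was drawn
```

Note that apply only returns a list here because the rows have different numbers of non-NA values; if every row had the same count, apply would simplify the result to a matrix instead.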
If we need to convert the list back to a matrix, we can pad each list element with NA at the end so that all elements have the same length, and then use rbind to combine them into a matrix.
do.call(rbind,lapply(lst, `length<-`, max(lengths(lst))))
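Putting the two steps together, here is a minimal sketch of a wrapper, assuming the input is a numeric matrix. The function name sample_rows is hypothetical. It swaps apply for lapply over row indices, so the intermediate result is always a list, and it samples by index with sample.int instead of calling sample on the values, which avoids sample's special handling of a single number.

```r
# Hypothetical helper: resample the non-NA values of each row with
# replacement, then pad every row with NA back to the original width.
sample_rows <- function(m) {
  m <- as.matrix(m)
  lst <- lapply(seq_len(nrow(m)), function(i) {
    x <- m[i, ]
    x <- x[!is.na(x)]
    # Draw positions rather than values, so a row with a single
    # non-NA value is not treated as the range 1:x
    x[sample.int(length(x), replace = TRUE)]
  })
  # `length<-` extends shorter vectors with NA at the end
  out <- do.call(rbind, lapply(lst, `length<-`, ncol(m)))
  rownames(out) <- rownames(m)
  out
}

set.seed(12345)
df1 <- c(rnorm(n = 4, mean = 0, sd = 1), NA)
df2 <- rnorm(n = 5, mean = 10, sd = 1)
df <- rbind(df1, df2)

out <- sample_rows(df)
out
```

In the result, the padding NA always sits in the last column of a row, regardless of where the NA was in the original row.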
Upvotes: 1