Reputation: 1458
I would like to bootstrap a large data set which contains multiple column and row variables. The following is a simplified re-creation of my data set:
charDataDiff <- data.frame(c('A','B','C'), matrix(1:72, nrow=9))
colnames(charDataDiff) <- c("patchId","s380","s390","s400","s410","s420","s430","s440","s450")
Separate the data using patchId as the criterion. This creates a list of three data frames, one for each patchId value:
idColor <- c("A", "B", "C")
(patchSpectrum <- lapply(idColor, function(idColor) charDataDiff[charDataDiff$patchId==idColor,]))
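For comparison, here is a minimal sketch of the same split using base R's split(), which returns a named list so each patch can be looked up by its patchId instead of by position. The name patchSpectrumByName is hypothetical and only used for this illustration; the rest of the question keeps using patchSpectrum.

```r
# Re-create the example data from the question
charDataDiff <- data.frame(c('A','B','C'), matrix(1:72, nrow=9))
colnames(charDataDiff) <- c("patchId","s380","s390","s400","s410","s420","s430","s440","s450")

# split() partitions the rows by patchId and returns a named list
# (patchSpectrumByName is a hypothetical name for this sketch)
patchSpectrumByName <- split(charDataDiff, charDataDiff$patchId)

names(patchSpectrumByName)    # one element per patchId value
patchSpectrumByName$C         # the rows belonging to patch "C"
```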
I created the function sampleBoot to resample, with replacement, the rows of one element of patchSpectrum:
sampleBoot <- function(nbootstrap=2, patch=3){
  # Draw nbootstrap resamples of one patch's rows, with replacement
  lapply(seq_len(nbootstrap), function(i) {
    patchSpectrum[[patch]][sample(nrow(patchSpectrum[[patch]]), replace=TRUE), ]
  })
}
Example:
sampleBoot(5,3)
Here is where I am stuck:
patchId
list along with each column variable (which the above "sampleBoot" easily accomplishes), patchId
sampling list iteration, and
Upvotes: 0
Views: 1095
Reputation: 9830
As far as I understand your question, you could do the following:
do.call(rbind, lapply(sampleBoot(5, 3), function(x) apply(x[-1], 2, median)))
It creates a table of the medians for 5 resamplings of patch 3: one row per bootstrap sample and one column per spectral variable (s380 to s450).
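If the goal is a single table covering every patch, the same idea can be looped over the patches. The following is only a sketch under my reading of the question: bootMedians is a hypothetical helper, and the patchId and iteration columns are my assumption about the output you want. It re-creates the question's data and sampleBoot so that it runs on its own.

```r
# Re-create the example data and the per-patch list from the question
charDataDiff <- data.frame(c('A','B','C'), matrix(1:72, nrow=9))
colnames(charDataDiff) <- c("patchId","s380","s390","s400","s410","s420","s430","s440","s450")
idColor <- c("A", "B", "C")
patchSpectrum <- lapply(idColor, function(id) charDataDiff[charDataDiff$patchId == id, ])

# sampleBoot as in the question: resample one patch's rows with replacement
sampleBoot <- function(nbootstrap=2, patch=3) {
  lapply(seq_len(nbootstrap), function(i) {
    patchSpectrum[[patch]][sample(nrow(patchSpectrum[[patch]]), replace=TRUE), ]
  })
}

# Hypothetical helper: column medians for every patch and bootstrap iteration
bootMedians <- function(nbootstrap = 5) {
  perPatch <- lapply(seq_along(patchSpectrum), function(p) {
    # One row of medians per bootstrap sample of patch p
    meds <- do.call(rbind, lapply(sampleBoot(nbootstrap, p),
                                  function(x) apply(x[-1], 2, median)))
    # Label each row with its patch and its iteration number (assumed layout)
    data.frame(patchId = idColor[p], iteration = seq_len(nbootstrap), meds)
  })
  do.call(rbind, perPatch)
}

set.seed(1)  # only to make the resampling reproducible
res <- bootMedians(5)
res
```

With 3 patches and 5 iterations this gives 15 rows: the patchId and iteration columns followed by the eight median columns.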
Upvotes: 1