Reputation: 3441
I have a 100 probabilities that are associated with four unique individuals (AAA:DDD) that I have created and displayed here.
IndID <- as.factor(rep(c("AAA", "BBB", "CCC", "DDD"),25))
Prob <- runif(length(IndID),0,1)
Data <- data.frame(IndID, Prob)
Data <- Data[order(Data$IndID),]
> head(Data)
IndID Prob
1 AAA 0.5860417
5 AAA 0.1824266
9 AAA 0.3301014
13 AAA 0.5048122
17 AAA 0.3717195
21 AAA 0.9090825
> summary(Data)
IndID Prob
AAA:25 Min. :0.01341
BBB:25 1st Qu.:0.19743
CCC:25 Median :0.48315
DDD:25 Mean :0.50475
3rd Qu.:0.81789
Max. :0.99805
I want to bootstrap (sample with replacement) the probabilities the for each individual AAA:DDD. For each iteration I want to discretize the probabilities to 0’s and 1’s using a 0.50 cut off and then sum the vector.
I have created the function below which discretizes and sums.
BiSum <- function(x){
IndBi <- ifelse(x >= 0.50, 1,0)
SumIndBi <- sum(IndBi)
}
I want to apply the function to the probabilities for each individual in a for() loop and use the boot() function as seen below.
require(boot)
SE <- numeric(length(unique(Data$IndID)))
for (i in unique(Data$IndID)){
IndProbs <- Data$Prob[Data$IndID == i]
b <- boot(IndProbs, BiSum, R=10)
SE[i] <- sqrt(var(b$t)) #This is a roundabout way to grab the SE from a boot() object
}
While I think the function BiSum is correct it is incorrectly incorporated into the for()
loop and boot()
function. The loop above results in the error:
Error in statistic(data, original, ...) : unused argument (original).
My goal: for each individual (AAA:DDD) I want to bootstrap Prob, discretize them using 0.50 as a cut off, and sum the resulting 0’s and 1’s. I want to do this R=10 (only low for example but will repeat R=10000 with real data) times for each AAA:DDD and then extract the standard error of the boot()
object.
Suggestions on how to improve my loop above would be appreciated. Specifically how to correctly incorporate the ”statistic” argument to boot()
within the loop.
Thanks in advance.
Upvotes: 1
Views: 563
Reputation: 2300
You have to include index argument idx
for the boot
function
BiSum <- function(x, idx){
IndBi <- ifelse(x[idx] >= 0.50, 1,0)
SumIndBi <- sum(IndBi)
}
Also, to get SE, it may be cleaner to use sd(b$t)
instead sqrt(var(b$t)
Upvotes: 1