Reputation: 63
I am trying to run a correlation test between different columns of a table. I also ran the same test with the bootstrap method. I wanted to compare the results, but found that they are exactly the same, so I am wondering whether I did something wrong.
df is a data.table with 20000 rows and 7 columns; the first column is the key.
Below is my bootstrap code. Could you please check it? Is it possible for the bootstrap result to be the same as the result from running on the whole dataset? Thank you!
n = nrow(df)
cor.small <- function(d,i= c(1:n)){
d2 <- d[i,]
cormat <- cor(d2[,-1,with=FALSE])
upper <- get_upper_tri(cormat)
return(upper)
}
result <- boot(data = df,statistic = cor.small, R= 999)
Upvotes: 0
Views: 219
Reputation: 9295
You should call the boot function like this (I have used the iris dataset to have something to work with, and modified the code a bit in places):
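library(boot)

# Statistic for boot(): d is the data, i holds the row indices of one resample
cor.small <- function(d, i) {
  # Drop the first column (the key in the question; for iris this drops Sepal.Length)
  cormat <- cor(d[i, -1])
  # Keep each pairwise correlation once, as a plain vector
  upper <- cormat[lower.tri(cormat)]
  return(upper)
}
df <- iris[, -5]  # numeric columns only
nsamp <- ceiling(nrow(df) / 2)  # or use a different value
nrun <- 10
set.seed(1)
# One manual resample, to check the statistic function on its own
cor.small(df, sample(1:nsamp, nsamp, replace = TRUE))
boot(data = df, statistic = cor.small, R = nrun)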
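To compare the bootstrap with the whole-dataset result, it may help to look inside the object that `boot` returns. The sketch below is only an illustration on iris, not the asker's data: it assumes all four numeric columns are used (no key column to drop) and picks `R = 200` arbitrarily. The `t0` component is the statistic computed on the original data, so it matches the full-data correlations by construction; the resampled values are in `t`, one row per replicate. The `original` column that `boot` prints is `t0`, which is one possible reason the two results looked identical.

```r
library(boot)

# Statistic: pairwise correlations of the resampled rows, as a vector.
# Assumption for this sketch: every column is numeric and none is a key.
cor.small <- function(d, i) {
  cormat <- cor(d[i, ])
  cormat[lower.tri(cormat)]
}

df <- iris[, -5]  # illustrative data: four numeric columns
set.seed(1)
result <- boot(data = df, statistic = cor.small, R = 200)

# Statistic on the whole dataset, computed directly
full <- cor.small(df, seq_len(nrow(df)))

# t0 is the full-data statistic, so this comparison holds by construction
all.equal(result$t0, full)

# t holds the bootstrap replicates: R rows, one column per correlation
dim(result$t)

# Summaries of the replicates are what actually differ from the full-data values
boot_mean <- colMeans(result$t)
boot_se <- apply(result$t, 2, sd)
cbind(full, boot_mean, boot_se)
```

With this layout, the bootstrap means should be close to the full-data correlations but not identical, and the standard errors give a sense of how much each correlation varies across resamples.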
Upvotes: 1