Reputation: 17388
I would like to create replicate predictions for one integer independent variable (iv1) given some model and a data frame called training. This is my current approach. I appreciate this is not self containing but hopefully it is self explanatory:
number_of_samples <- 10
results <- NULL
for (row in 1:nrow(training)) {
fake_iv1_values <- sample(1:100, number_of_samples)
case <- training[row,]
for (iv1 in fake_iv1_values) {
case$iv1 <- iv1
case$prediction <- predict(some_model, newdata = case)
results <- rbind(results, case)
}
}
Using loops is very slow. I wonder, if this could be sped up? Thanks!
Upvotes: 0
Views: 43
Reputation: 7818
Try with this.
Reproducible fake data and model:
# create fake data
n_row <- 100
n_xs <- 100
training <- data.frame(y = rnorm(n_row), iv1 = rnorm(n_row))
training[, paste0("x",1:n_xs)] <- replicate(n_xs, list(rnorm(n_row)))
# example model
some_model <- lm(y~., training)
Rewritten code:
number_of_samples <- 10
results <- NULL
# vector of several fake_iv1_values vectors
fake_iv1_values <- as.numeric(replicate(nrow(training), sample(1:100, number_of_samples)))
# replicate each row of the original dataframe
results <- training[rep(seq_len(nrow(training)), each = number_of_samples), ]
# add fake values to the replicated dataframe
results$iv1 <- fake_iv1_values
# get predictions
results$prediction <- predict(some_model, newdata = results)
Upvotes: 1