Ekaterina

Reputation: 69

Why does my code take so long to process?

I am trying to run code from this web site on my computer. I use a data set from a Kaggle competition. My training data has 1022 rows and 81 variables. I run this code:

library(gbm)

hyper_grid <- expand.grid(
  shrinkage = c(.01, .1, .3),
  interaction.depth = c(1, 3, 5),
  n.minobsinnode = c(5, 10, 15),
  bag.fraction = c(.65, .8, 1), 
  optimal_trees = 0,               # a place to dump results
  min_RMSE = 0                     # a place to dump results
)

random_index <- sample(1:nrow(ames_train), nrow(ames_train))
random_ames_train <- ames_train[random_index, ]

# grid search 
for(i in 1:nrow(hyper_grid)) {
  
  # reproducibility
  set.seed(123)
  
  # train model
  gbm.tune <- gbm(
    formula = SalePrice ~ .,
    distribution = "gaussian",
    data = random_ames_train,
    n.trees = 5000,
    interaction.depth = hyper_grid$interaction.depth[i],
    shrinkage = hyper_grid$shrinkage[i],
    n.minobsinnode = hyper_grid$n.minobsinnode[i],
    bag.fraction = hyper_grid$bag.fraction[i],
    train.fraction = .75,
    n.cores = NULL, # will use all cores by default
    verbose = FALSE
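  )

  # record the best iteration and its validation RMSE
  # (gbm stores per-tree validation error in valid.error when train.fraction < 1)
  hyper_grid$optimal_trees[i] <- which.min(gbm.tune$valid.error)
  hyper_grid$min_RMSE[i] <- sqrt(min(gbm.tune$valid.error))
}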

I have been waiting for more than an hour. I think it's because my laptop is not powerful enough. The picture shows the specifications of my computer.

Can my computer perform this operation? If so, how long should I expect to wait?

Upvotes: -1

Views: 97

Answers (1)

jjack

Reputation: 99

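It's taking a long time because you're training 81 GBM models (the grid has 3 x 3 x 3 x 3 = 81 parameter combinations), each with 5000 trees, and GBMs are expensive to fit. To get a rough estimate of the total training time, you could train one model, time it, and multiply that time by 81.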
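A minimal sketch of that estimate, assuming the `hyper_grid` and `random_ames_train` objects from the question are already in the workspace. `estimate_grid_time` is a hypothetical helper written for this illustration, not part of gbm. It times a single fit with base R's `system.time()` and scales the result by the number of rows in the grid.

```r
# Hypothetical helper (not part of gbm): time one fit and scale it to the grid.
estimate_grid_time <- function(fit_one, n_models) {
  # system.time() reports user, system and elapsed seconds; use wall-clock time
  elapsed <- system.time(fit_one())[["elapsed"]]
  c(one_model_sec = elapsed, grid_total_min = elapsed * n_models / 60)
}

# Usage with the objects from the question, fitting only the first grid row.
# The guard skips the demo if gbm or the data objects are not available.
if (requireNamespace("gbm", quietly = TRUE) &&
    exists("hyper_grid") && exists("random_ames_train")) {
  est <- estimate_grid_time(
    function() gbm::gbm(
      formula = SalePrice ~ .,
      distribution = "gaussian",
      data = random_ames_train,
      n.trees = 5000,
      interaction.depth = hyper_grid$interaction.depth[1],
      shrinkage = hyper_grid$shrinkage[1],
      n.minobsinnode = hyper_grid$n.minobsinnode[1],
      bag.fraction = hyper_grid$bag.fraction[1],
      train.fraction = .75,
      verbose = FALSE
    ),
    n_models = nrow(hyper_grid)  # 3 * 3 * 3 * 3 = 81 combinations
  )
  print(est)
}
```

Treat the result as a rough figure. The first grid row uses `interaction.depth = 1`, and deeper trees will likely take longer per model, so the full grid may run longer than this estimate suggests.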

Upvotes: 2
