Jordan

Reputation: 614

Implementing loo_cv from rsample in tidymodels

I'm new to tidymodels syntax and would like to implement leave-one-out cross-validation using loo_cv from rsample within the tidymodels framework. However, the implementation seems to differ from vfold_cv, and I can't find any helpful examples that use loo_cv. Yes, I've checked the help page for examples.

I would like to emulate the workflow illustrated below, taken from the fit_resamples() help page, but I cannot find a similar example for loo_cv. Swapping loo_cv into the code below produces an error saying that fit_resamples() does not support loo_cv, but I do not know what does support it. I assume the right solution will involve fit_split(), but I cannot get that to work either. I have been Googling and generating error messages for hours, though I imagine the solution is quite simple. Thank you in advance for any direction!

library(tidymodels)

# 5-fold CV works; swapping in loo_cv() makes fit_resamples() error
folds <- vfold_cv(mtcars, v = 5)
# folds <- loo_cv(mtcars) # generates error message with fit_resamples()

# Natural splines on disp and wt
spline_rec <- recipe(mpg ~ ., data = mtcars) %>%
  step_ns(disp) %>%
  step_ns(wt)

lin_mod <- linear_reg() %>%
  set_engine("lm")

# Keep the out-of-sample predictions for each resample
control <- control_resamples(save_pred = TRUE)

spline_res <- fit_resamples(lin_mod, spline_rec, folds, control = control)

spline_res %>%
  collect_predictions()

Upvotes: 4

Views: 758

Answers (2)

topepo

Reputation: 14316

We don't really support LOO in tidymodels. It's a fairly deprecated method and you'd be better off using the bootstrap or many repeats of 10-fold CV.

We may work with it in the future but, from a support point-of-view, the overhead of that method is fairly high (since it behaves differently than all other methods). We'd rather spend time on other missing features for now.
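As a rough sketch of the two alternatives mentioned above, the question's workflow can be reused with only the resampling object changed. The choices of 5 repeats and 25 bootstrap resamples are illustrative assumptions, not recommendations from this answer.

```r
library(tidymodels)

set.seed(123)

# Same recipe and model as in the question
spline_rec <- recipe(mpg ~ ., data = mtcars) %>%
  step_ns(disp) %>%
  step_ns(wt)

lin_mod <- linear_reg() %>%
  set_engine("lm")

# Option 1: repeated 10-fold CV (5 repeats is an arbitrary illustrative value)
folds <- vfold_cv(mtcars, v = 10, repeats = 5)

# Option 2: the bootstrap (25 resamples is an arbitrary illustrative value)
boots <- bootstraps(mtcars, times = 25)

cv_res <- fit_resamples(lin_mod, spline_rec, folds)
boot_res <- fit_resamples(lin_mod, spline_rec, boots)

# Performance estimates averaged over the resamples
cv_metrics <- collect_metrics(cv_res)
boot_metrics <- collect_metrics(boot_res)
```

Both objects drop straight into fit_resamples(), so the rest of the workflow (control_resamples(), collect_predictions()) stays as it is in the question.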

Upvotes: 3

Jordan

Reputation: 614

The following code works but I don't think it is really capturing the efficiency or utility of the tidymodels approach. Would still love a better suggestion.

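library(tidymodels)

# One split per row of mtcars
loocvdat <- loo_cv(mtcars)

lm_spec <- linear_reg() %>%
  set_engine("lm")

# Fit on the analysis set of one split and predict its held-out row
splitfun <- function(mysplit) {
  fit_split(mpg ~ .,
            model = lm_spec,
            split = mysplit) %>%
    collect_predictions()
}

map(loocvdat$splits, splitfun)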
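For comparison, here is a minimal sketch of the same leave-one-out loop written with only rsample's analysis() and assessment() accessors and base lm(). It swaps the parsnip specification for a plain lm() call, so it illustrates the mechanics rather than a tidymodels-native solution. The names predict_held_out and loo_preds are made up for this example.

```r
library(rsample)

# One split per row: 31 rows for fitting, 1 row held out
loo_splits <- loo_cv(mtcars)

# Fit on the analysis set and predict the single held-out row
predict_held_out <- function(split) {
  fit <- lm(mpg ~ ., data = analysis(split))
  held_out <- assessment(split)
  data.frame(
    mpg = held_out$mpg,
    .pred = unname(predict(fit, newdata = held_out))
  )
}

# Stack the per-split predictions into one data frame
loo_preds <- do.call(rbind, lapply(loo_splits$splits, predict_held_out))

# Leave-one-out RMSE computed from the pooled predictions
loo_rmse <- sqrt(mean((loo_preds$mpg - loo_preds$.pred)^2))
```

Pooling the predictions before computing the metric matters here, because each assessment set holds a single row and per-split metrics such as R-squared are undefined.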

Upvotes: 0
