ohnoplus

Reputation: 1325

iterating over formulas in purrr

I have a set of formulas, stored as strings, that I'd like to fit one at a time with glm, preferably using tidyverse functions. Here's where I am so far.

library(tidyverse)
library(broom)

mtcars %>% dplyr::select(mpg:qsec) %>% colnames -> targcols
paste('vs ~ ', targcols) -> formulas
formulas

#> 'vs ~  mpg' 'vs ~  cyl' 'vs ~  disp' 'vs ~  hp' 'vs ~  drat' 'vs ~  wt' 'vs ~  qsec' 

I can run a generalized linear model with any one of these formulas as

glm(as.formula(formulas[1]), family = 'binomial', data = mtcars) %>% glance

#> null.deviance  df.null  logLik     AIC       BIC       deviance  df.residual
#> 43.86011       31       -12.76667  29.53334  32.46481  25.53334  30

I'd like to run the glm with every possible formula in the list. I tried doing that as follows.

data.frame(formulas = formulas) %>%
    mutate(mod = map(formulas, function(fs){
        glm(as.formula(fs), family = 'binomial', data = mtcars)
    }))

But then I get the following error message:

Error in mutate_impl(.data, dots): Evaluation error: invalid formula.
Traceback:

1. data.frame(formulas = formulas) %>% mutate(mod = map(formulas, function(fs) { glm(as.formula(fs), family = "binomial", data = mtcars) }))
2. withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
3. eval(quote(`_fseq`(`_lhs`)), env, env)
4. eval(quote(`_fseq`(`_lhs`)), env, env)
5. `_fseq`(`_lhs`)
6. freduce(value, `_function_list`)
7. withVisible(function_list[[k]](value))
8. function_list[[k]](value)
9. mutate(., mod = map(formulas, function(fs) { glm(as.formula(fs), family = "binomial", data = mtcars) }))
10. mutate.data.frame(., mod = map(formulas, function(fs) { glm(as.formula(fs), family = "binomial", data = mtcars) }))
11. as.data.frame(mutate(tbl_df(.data), ...))
12. mutate(tbl_df(.data), ...)
13. mutate.tbl_df(tbl_df(.data), ...)
14. mutate_impl(.data, dots)

Could somebody tell me what I am missing here? Thanks for any advice.

Upvotes: 1

Views: 614

Answers (1)

Ben Bolker

Reputation: 226192

The problem is that you're using data.frame(); I'm not 100% sure why this doesn't work, but I think it's because data frames don't smoothly handle list columns.

Changing data.frame to tibble works for me. (It's from the tibble package, also exported via dplyr, so it should be available after library("tidyverse").)
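
One possible explanation, which is an assumption and not something confirmed in this thread: before R 4.0.0, data.frame() converted character columns to factors by default (stringsAsFactors = TRUE), whereas tibble() keeps them as character. A factor is stored as integer codes, which as.formula() cannot interpret. The sketch below uses only base R and sets stringsAsFactors explicitly so it behaves the same on any R version; the names df_factor, df_char, factor_result and char_result are made up for illustration.

```r
# Assumption (not confirmed in the thread): the strings were stored as a factor.
formulas <- paste("vs ~", c("mpg", "cyl"))

# Set stringsAsFactors explicitly so the result does not depend on the R version.
df_factor <- data.frame(formulas = formulas, stringsAsFactors = TRUE)
df_char   <- data.frame(formulas = formulas, stringsAsFactors = FALSE)

class(df_factor$formulas)  # "factor"
class(df_char$formulas)    # "character"

# A factor holds integer codes, so as.formula() fails on it;
# capture the error message instead of stopping the script.
factor_result <- tryCatch(as.formula(df_factor$formulas[1]),
                          error = function(e) conditionMessage(e))
factor_result

# The character version converts cleanly.
char_result <- as.formula(df_char$formulas[1])
class(char_result)  # "formula"
```

If that is the cause, data.frame(formulas = formulas, stringsAsFactors = FALSE) might also make the original code run, although tibble() avoids the question entirely.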

You can shorten your code a little bit:

tibble(formulas) %>%
    mutate(mod = map(formulas, 
                      ~  glm(as.formula(.),
                             family = 'binomial', data = mtcars)))
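
Since the question pipes a single model into glance, here is a minimal sketch of how the same summary might be collected for every model at once. It assumes tidyverse and broom are installed; the names fits and stats are chosen for illustration and do not come from the original code. Some of these single-predictor fits may emit glm warnings, which this sketch does not handle.

```r
library(tidyverse)
library(broom)

# Rebuild the formula strings from the question.
formulas <- paste("vs ~", colnames(dplyr::select(mtcars, mpg:qsec)))

fits <- tibble(formulas) %>%
    mutate(
        # One logistic regression per formula string.
        mod = map(formulas,
                  ~ glm(as.formula(.x), family = "binomial", data = mtcars)),
        # One-row model summary per fit.
        stats = map(mod, glance)
    ) %>%
    # Spread the glance columns (AIC, BIC, deviance, ...) next to each formula.
    unnest(stats)

fits %>% select(formulas, AIC, deviance)
```

Keeping the fitted models in the mod list column means they remain available for later steps, such as calling tidy on each one.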

Upvotes: 6
