Reputation: 21274
I want to use summarise/across with lm to fit regressions using different columns in a tibble. Like this:
library(tidyverse)
library(broom)
fits <- tibble(mtcars) %>%
  summarise(across(c(vs, am), ~list(tidy(lm(wt ~ .x + mpg)))))
But the columns that get passed into lm as '.x' end up labeled as .x in the regression output.
fits %>% unnest(vs)
# A tibble: 3 x 6
term estimate std.error statistic p.value am
<chr> <dbl> <dbl> <dbl> <dbl> <list>
1 (Intercept) 6.10 0.353 17.3 8.36e-17 <tibble [3 × 5]>
2 .x 0.0738 0.239 0.308 7.60e- 1 <tibble [3 × 5]>
3 mpg -0.145 0.0200 -7.24 5.63e- 8 <tibble [3 × 5]>
I can preserve the name if I build the lm formula on the fly and use cur_column(), but this feels kludgy:
tibble(mtcars) %>%
  summarise(across(c(vs, am),
    ~list(tidy(lm(formula(paste0("wt ~ ", cur_column(), " + mpg"))))))) %>%
  unnest(vs)
# A tibble: 3 x 6
term estimate std.error statistic p.value am
<chr> <dbl> <dbl> <dbl> <dbl> <list>
1 (Intercept) 6.10 0.353 17.3 8.36e-17 <tibble [3 × 5]>
2 vs 0.0738 0.239 0.308 7.60e- 1 <tibble [3 × 5]>
3 mpg -0.145 0.0200 -7.24 5.63e- 8 <tibble [3 × 5]>
I want the output to use the true name of the column passed as .x, without this workaround, while still using the summarise/across motif and without incorporating map.
Seems like this should be possible. Any suggestions?
*Copying my comment from @akrun's answer to clarify what I'm looking for:
What I really want to know is whether the column name is preserved in the summarise/across operation in a way that I can reference directly in lm, something like {{.x}} or rlang::as_name(.x). I know those don't work, but it seems like the name information should be preserved somewhere other than the string version in cur_column().
Upvotes: 0
Views: 47
Reputation: 887651
You can make it shorter with reformulate:
library(dplyr)
library(broom)
library(tidyr)
tibble(mtcars) %>%
  summarise(across(c(vs, am), ~
    list(tidy(lm(reformulate(c(cur_column(), "mpg"), "wt")))))) %>%
  unnest(vs)
Output:
# A tibble: 3 x 6
# term estimate std.error statistic p.value am
# <chr> <dbl> <dbl> <dbl> <dbl> <list>
#1 (Intercept) 6.10 0.353 17.3 8.36e-17 <tibble [3 × 5]>
#2 vs 0.0738 0.239 0.308 7.60e- 1 <tibble [3 × 5]>
#3 mpg -0.145 0.0200 -7.24 5.63e- 8 <tibble [3 × 5]>
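To see what reformulate is doing on its own, here is a minimal base R sketch outside of summarise/across. It hard-codes the string "vs" where cur_column() would supply the column name, and it passes data = mtcars explicitly, which is an assumption made only to keep the example self-contained. The names f and term_names are just illustrative.

```r
# reformulate() builds a formula from a character vector of term labels
# plus a response; here "vs" stands in for what cur_column() would return.
f <- reformulate(c("vs", "mpg"), response = "wt")

# Fit the model with the data supplied explicitly (an assumption for this
# standalone sketch; inside across() the columns come from the data mask).
fit <- lm(f, data = mtcars)

# The coefficient names come from the formula terms, so the real column
# name appears rather than ".x".
term_names <- names(coef(fit))
print(term_names)
```

Because the formula is built from the column name as a string, the fitted model carries that name in its terms, which is why the tidy() output above shows vs rather than .x.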
Upvotes: 1