Reputation: 1909
It is fairly straightforward to use the augment() function from the broom package in R to add predictions back into a tibble. For example:
library(tidyverse)  # dplyr, tidyr, purrr
library(broom)

df <- iris %>%
  nest(data = everything()) %>%
  mutate(model = map(data, function(x) lm(Sepal.Length ~ Sepal.Width, data = x)),
         pred = map2(model, data, ~augment(.x, newdata = .y))) %>%
  unnest(pred)
However, when I take a linear model trained on one set of data and try to predict on new data, I receive the following error.
mod <- lm(Sepal.Length ~ Sepal.Width, data = iris)

df2 <- iris %>%
  mutate(Sepal.Width = Sepal.Width + rnorm(1)) %>%
  nest(data = everything()) %>%
  mutate(pred = map2(mod, data, ~augment(.x, newdata = .y)))
# Error: Problem with `mutate()` input `pred`.
# x No augment method for objects of class numeric
# i Input `pred` is `map2(mod, data, ~augment(.x, newdata = .y))`.
How should I use augment() to predict on new data? Is using an external model object (mod in the example above) the best practice, or is there a more elegant way?
Upvotes: 0
Views: 2425
Reputation: 389065
Since there is only one model, we can do this without using map.
library(dplyr)

df1 <- iris %>%
  mutate(Sepal.Width = Sepal.Width + rnorm(1)) %>%
  tidyr::nest(data = everything()) %>%
  summarise(pred = broom::augment(mod, newdata = data[[1]]),
            mod = list(mod),
            data = data)
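If the nested layout is not needed afterwards, the same predictions can probably be obtained by calling augment() directly on the new data frame. The following is a minimal sketch under that assumption; new_iris is a hypothetical name, and set.seed() is added only so the random shift is reproducible.

```r
library(broom)

# Fit once on the original data, as in the question.
mod <- lm(Sepal.Length ~ Sepal.Width, data = iris)

# new_iris is a hypothetical name for the shifted data.
set.seed(1)  # assumption: only here to make the random shift reproducible
new_iris <- iris
new_iris$Sepal.Width <- new_iris$Sepal.Width + rnorm(1)

# augment() returns the rows of new_iris with a .fitted column appended.
pred <- augment(mod, newdata = new_iris)
head(pred)
```

The .fitted column should match what predict(mod, newdata = new_iris) returns, so this is only a convenience over base R when the tibble output is wanted.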
Upvotes: 1
Reputation: 1909
Having just posted the question, I think I have an answer. I won't accept the answer for 48 hours just in case someone contradicts or provides a more comprehensive one.
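In the example, map2 iterates over the elements of its first argument. An lm object is itself a list of components, so map2 passes each component to augment in turn rather than the model as a whole. The first component is the numeric vector of coefficients, hence "No augment method for objects of class numeric". Putting mod into the tibble as a list column avoids the error and correctly calculates predictions.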
mod <- lm(Sepal.Length ~ Sepal.Width, data = iris)

df2 <- iris %>%
  mutate(Sepal.Width = Sepal.Width + rnorm(1)) %>%
  nest(data = everything()) %>%
  mutate(mod = list(mod)) %>% #! this is the additional step
  mutate(pred = map2(mod, data, ~augment(.x, newdata = .y))) %>%
  unnest(pred)
Alternatively, wrap the external model object in a list inside the map2 call.
...
mutate(pred = map2(list(mod), data, ~augment(.x, newdata = .y))) %>%
...
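To see why the original call failed and why wrapping helps, it may be useful to inspect the model object itself. This is a small illustrative sketch using base R only; wrapped is a hypothetical name.

```r
mod <- lm(Sepal.Length ~ Sepal.Width, data = iris)

# An lm fit is stored as a list of components,
# so mapping over it visits each component in turn.
is.list(mod)
names(mod)[1]     # the first component is the coefficients
class(mod[[1]])   # a numeric vector, which augment() has no method for

# Wrapping the model gives a list of length 1
# whose only element is the whole fitted model.
wrapped <- list(mod)   # hypothetical name, for illustration
length(wrapped)
class(wrapped[[1]])
```

With the wrapped version, map2 sees a single element of class lm and pairs it with the single nested data frame, which is the pairing augment needs.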
Upvotes: 0