How can I efficiently extract the fitted values from several linear regression models and append them to the original data used to build the models?
Example Data:
library(dplyr)
# Fit several (3 in this case) linear regression models
fitted_models <- iris %>%
  group_by(Species) %>%
  do(model = lm(Petal.Length ~ Sepal.Length + Sepal.Width, data = .))
I can extract the fitted values for each group manually (see below), but this is cumbersome and does not scale to tens or hundreds of models. How can I extract the fitted values from all the models more efficiently and append them to the dataset used to build the models?
df2 <- iris[,c(5,3)]
df2$predicted <- NA
df2[1:50,3] <- fitted_models$model[[1]]$fitted.values
df2[51:100,3] <- fitted_models$model[[2]]$fitted.values
df2[101:150,3] <- fitted_models$model[[3]]$fitted.values
df2
Upvotes: 2
Nested data frames are helpful for this kind of task. Here is one approach that covers the whole problem: fit one model per group, extract the fitted values, and append them to the original data.
You can find more examples in the broom vignette:
https://cran.r-project.org/web/packages/broom/vignettes/broom_and_dplyr.html
library(dplyr)
library(tidyr)
library(purrr)
fitted_models <- iris %>%
  nest(data = -Species) %>%
  mutate(fit = map(data, ~ lm(Petal.Length ~ Sepal.Length + Sepal.Width, data = .x)),
         fitted.values = map(fit, "fitted.values")) %>%
  unnest(cols = c(data, fitted.values)) %>%
  select(-fit)
> fitted_models
# A tibble: 150 x 6
   Species Sepal.Length Sepal.Width Petal.Length Petal.Width fitted.values
   <fct>          <dbl>       <dbl>        <dbl>       <dbl>         <dbl>
 1 setosa          5.1         3.5          1.4         0.2          1.47
 2 setosa          4.9         3            1.4         0.2          1.46
 3 setosa          4.7         3.2          1.3         0.2          1.42
 4 setosa          4.6         3.1          1.5         0.2          1.41
 5 setosa          5           3.6          1.4         0.2          1.46
 6 setosa          5.4         3.9          1.7         0.4          1.51
 7 setosa          4.6         3.4          1.4         0.3          1.40
 8 setosa          5           3.4          1.5         0.2          1.46
 9 setosa          4.4         2.9          1.4         0.2          1.38
10 setosa          4.9         3.1          1.5         0.1          1.45
# ... with 140 more rows
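As a possible variation on the nested approach, the broom package (the subject of the vignette linked above) provides augment(), which returns the supplied data with the fitted values added in a column named .fitted, alongside residuals and other diagnostics. The sketch below assumes broom is installed; the names augmented and aug are only illustrative.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)  # assumption: broom is installed

augmented <- iris %>%
  nest(data = -Species) %>%
  mutate(
    # one model per species
    fit = map(data, ~ lm(Petal.Length ~ Sepal.Length + Sepal.Width, data = .x)),
    # augment() returns the rows of `data` plus .fitted, .resid, etc.
    aug = map2(fit, data, ~ augment(.x, data = .y))
  ) %>%
  select(Species, aug) %>%
  unnest(cols = aug)
```

Passing data = .y keeps columns that were not used in the model (here Petal.Width) in the result.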
Upvotes: 2
Once the models are created with do(), the result is grouped rowwise (one model per row), so we can extract the fitted values directly into a list column and then unnest that column.
library(dplyr)
library(tidyr)
fitted_models %>%
  transmute(Species, fitted.values = list(model$fitted.values)) %>%
  ungroup() %>%
  unnest(fitted.values)
Output:
# A tibble: 150 × 2
   Species fitted.values
   <fct>           <dbl>
 1 setosa           1.47
 2 setosa           1.46
 3 setosa           1.42
 4 setosa           1.41
 5 setosa           1.46
 6 setosa           1.51
 7 setosa           1.40
 8 setosa           1.46
 9 setosa           1.38
10 setosa           1.45
# … with 140 more rows
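The result above contains only Species and the fitted values. To append them to the original data, as the question asks, one option is base R's unsplit(), which places each group's values back at the row positions of that group. This is a minimal sketch, assuming the models are stored in the order of the factor levels of Species (which is how group_by() orders them) and that lm() dropped no rows (iris has no missing values). The names fitted_list and iris_with_fit are illustrative.

```r
library(dplyr)

# Models as built in the question: one row per species
fitted_models <- iris %>%
  group_by(Species) %>%
  do(model = lm(Petal.Length ~ Sepal.Length + Sepal.Width, data = .))

# One unnamed vector of fitted values per model, in factor-level order
fitted_list <- lapply(fitted_models$model, function(m) unname(fitted(m)))

# unsplit() fills each group's values into the rows belonging to that group,
# so it does not rely on the rows of iris being sorted by Species
iris_with_fit <- iris
iris_with_fit$predicted <- unsplit(fitted_list, iris$Species)
```

Unlike assigning to hard-coded row ranges such as 1:50, this works for any number of groups and any row order.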
Upvotes: 3