Reputation: 10131
I'd like to fit models to a grouped data frame and then predict one new value per model (i.e. group).
library(dplyr)
library(broom)
data(iris)
dat <- rbind(iris, iris)
dat$Group <- rep(c("A", "B"), each = 150)
new.dat <- data.frame(Group = rep(c("A", "B"), each = 3),
                      Species = rep(c("setosa", "versicolor", "virginica"), times = 2),
                      Sepal.Width = 1:6)
> new.dat
  Group    Species Sepal.Width
1     A     setosa           1
2     A versicolor           2
3     A  virginica           3
4     B     setosa           4
5     B versicolor           5
6     B  virginica           6
However, augment returns 36 rows, as if every row of new.dat were predicted with each of the 6 models (6 models x 6 rows). How can I preserve the grouping here and get one fitted value per group?
dat %>%
  group_by(Species, Group) %>%
  do(augment(lm(Sepal.Length ~ Sepal.Width, data = .), newdata = new.dat))
# A tibble: 36 x 5
# Groups: Species, Group [6]
Group Species Sepal.Width .fitted .se.fit
<fct> <fct> <int> <dbl> <dbl>
1 A setosa 1 3.33 0.221
2 A versicolor 2 4.02 0.133
3 A virginica 3 4.71 0.0512
4 B setosa 4 5.40 0.0615
5 B versicolor 5 6.09 0.145
6 B virginica 6 6.78 0.234
7 A setosa 1 3.33 0.221
8 A versicolor 2 4.02 0.133
9 A virginica 3 4.71 0.0512
10 B setosa 4 5.40 0.0615
# ... with 26 more rows
(Note that the repeated rows here are exact duplicates only because of how the example data is constructed; this is not the case with my original data.)
Upvotes: 1
Views: 879
Reputation: 1666
You need to make the Species and Group of new.dat match those of the group currently being processed in do. You can do this like so:
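group.cols <- c("Species", "Group")
dat %>%
  # group.cols holds column names as strings, so convert them to symbols before splicing
  group_by(!!!rlang::syms(group.cols)) %>%
  do(augment(lm(Sepal.Length ~ Sepal.Width, data = .),
             # keep only the new.dat rows that belong to the current group
             newdata = semi_join(new.dat, ., by = group.cols)))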
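If you want to check the idea without dplyr or broom, the same matching can be sketched in base R. This is only an illustration under assumptions, not the approach from the answer above: the names pieces, preds and result are made up for this example, and it uses predict() instead of augment(), so it returns a fitted value but no standard error.

```r
# Rebuild the example data from the question.
dat <- rbind(iris, iris)
dat$Group <- rep(c("A", "B"), each = 150)
new.dat <- data.frame(Group = rep(c("A", "B"), each = 3),
                      Species = rep(c("setosa", "versicolor", "virginica"), times = 2),
                      Sepal.Width = 1:6,
                      stringsAsFactors = FALSE)

# One data frame per Species/Group combination (hypothetical helper name).
pieces <- split(dat, list(dat$Species, dat$Group), drop = TRUE)

# Fit each group's model, then predict only for that group's rows of new.dat.
preds <- lapply(pieces, function(d) {
  fit <- lm(Sepal.Length ~ Sepal.Width, data = d)
  nd <- new.dat[new.dat$Species == as.character(d$Species[1]) &
                  new.dat$Group == d$Group[1], , drop = FALSE]
  nd$.fitted <- predict(fit, newdata = nd)
  nd
})

# Stack the per-group predictions into a single data frame.
result <- do.call(rbind, preds)
rownames(result) <- NULL
result
```

With the example data this should give one row per Species/Group combination, which is a quick way to confirm that each new value is only predicted by the model fitted on its own group.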
Upvotes: 1