Reputation: 1454
The following function fits a linear regression to a dataset and returns the fitted equation and r2 as a plotmath expression.
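eqlabels <- function(data, x, y) {
  # Note: x ~ y regresses x on y, although the label below reads y = a + b * x
  m <- lm(x ~ y, data)
  # Build a plotmath call with the formatted coefficients and r^2 substituted in
  eq <- substitute(italic(y) == a + b * italic(x) * "," ~~ italic(r) ^ 2 ~ "=" ~ r2,
                   list(a = format(coef(m)[1], digits = 3),
                        b = format(coef(m)[2], digits = 3),
                        r2 = format(summary(m)$r.squared, digits = 2)))
  # Note: this character conversion is discarded; the unevaluated call is returned
  as.character(as.expression(eq))
  return(eq)
}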
We can see it working on the full dataset:
s <- eqlabels(data = iris, x = iris$Sepal.Length, y = iris$Sepal.Width)
s
italic(y) == "6.53" + "-0.223" * italic(x) * "," ~ ~italic(r)^2 ~
"=" ~ "0.014"
The question is how to use this function with dplyr to calculate the equation and r2 values for each of several groups, rather than for a single group. For example:
result <- iris %>% group_by(Species) %>% eqlabels(x = iris$Sepal.Length, y = iris$Sepal.Width)
This runs, but it produces only one equation and r2 rather than three:
> result
italic(y) == "6.53" + "-0.223" * italic(x) * "," ~ ~italic(r)^2 ~
"=" ~ "0.014"
do (see ?do) seems to be the dplyr function for this, but I can't get it to work:
result <- iris %>% group_by(Species) %>% do(eqlabels(x = iris$Sepal.Length, y = iris$Sepal.Width),.)
This call fails with an error.
Please note that I'm trying to avoid using ddply from the plyr package. Thank you.
Upvotes: 1
Views: 75
Reputation: 24945
Try:
result <- iris %>% group_by(Species) %>%
summarise(labels = list(eqlabels(., x = .$Sepal.Length, y = .$Sepal.Width)))
Source: local data frame [3 x 2]

     Species    labels
      (fctr)     (chr)
1     setosa <call[3]>
2 versicolor <call[3]>
3  virginica <call[3]>
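One caveat about the call above: with %>%, the . inside summarise is the whole piped-in data frame, so .$Sepal.Length is all 150 rows and every group ends up with the same whole-dataset label. Below is a minimal sketch of a per-group variant. It assumes eqlabels as defined in the question (repeated here, minus the discarded as.character line, so the block runs on its own) and uses do(), where . stands for the current group's rows. The name per_group is only illustrative.

```r
library(dplyr)

# eqlabels() as in the question; it returns an unevaluated plotmath call
eqlabels <- function(data, x, y) {
  m <- lm(x ~ y, data)
  eq <- substitute(italic(y) == a + b * italic(x) * "," ~~ italic(r) ^ 2 ~ "=" ~ r2,
                   list(a = format(coef(m)[1], digits = 3),
                        b = format(coef(m)[2], digits = 3),
                        r2 = format(summary(m)$r.squared, digits = 2)))
  eq
}

# Inside do(), `.` is the data for the current group only,
# so each species gets its own fit. A named argument makes a list column.
per_group <- iris %>%
  group_by(Species) %>%
  do(labels = eqlabels(., x = .$Sepal.Length, y = .$Sepal.Width))

# One label per species, in the order of the Species levels
per_group$labels[[1]]
```

do() is marked as superseded in recent dplyr releases, so treat this as a sketch of the idea rather than the recommended form.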
If you want dplyr to work nicely with group_by, you need to pass it a mutate, summarise or do, rather than your own function. The output from your function is also not super nice - I've wrapped it in a list:
result$labels[[1]]
italic(y) == "6.53" + "-0.223" * italic(x) * "," ~ ~italic(r)^2 ~
"=" ~ "0.014"
As the above comment mentions, you should use the broom package; it will make your life much easier.
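A minimal sketch of the broom route, assuming broom is installed. The formula (Sepal.Length regressed on Sepal.Width, which is what the question's call effectively fits) and the names fits and coefs are illustrative choices, not taken from the posts above. glance() returns one row of model summaries per fit, including r.squared, and tidy() returns one row per coefficient.

```r
library(dplyr)
library(broom)

# One row per species with model-level statistics such as r.squared
fits <- iris %>%
  group_by(Species) %>%
  do(glance(lm(Sepal.Length ~ Sepal.Width, data = .)))

# Two rows per species: intercept and slope, with their estimates
coefs <- iris %>%
  group_by(Species) %>%
  do(tidy(lm(Sepal.Length ~ Sepal.Width, data = .)))

fits[, c("Species", "r.squared")]
coefs[, c("Species", "term", "estimate")]
```

Because the results are ordinary data frames, the equation label can then be assembled from the estimate and r.squared columns with paste() or sprintf(), instead of being extracted from a call object.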
Upvotes: 1