Neuls
Neuls

Reputation: 123

Doing a linear fit for each group of a data frame, check heteroscedasticity

I have a data frame like this:

ORD   exp type         mu
1   Combi pH=7 exp_F   mu 0.15637365
2   Combi pH=7 exp_F   mu 0.12817901
3   Combi pH=7 exp_F   mu 0.13392221
4   Combi pH=7 exp_F   mu 0.09683254
5   Combi pH=7 exp_F   mu 0.11249738
6   Combi pH=7 exp_F   mu 0.10878719
7   Combi pH=7 exp_F   mu 0.11019295
8   Combi pH=7 exp_F   mu 0.12100511
9   Combi pH=7 exp_F   mu 0.09803942
10  Combi pH=7 exp_F   mu 0.13842086
11  Combi pH=7 exp_F   mu 0.12778964
12     ORD0793 exp_F   mu 0.13910441
13     ORD0793 exp_F   mu 0.12603702
14     ORD0793 exp_F   mu 0.12670842
15     ORD0795 exp_F   mu 0.12982122
16     ORD0795 exp_F   mu 0.13648100
17     ORD0795 exp_F   mu 0.13593685
18     ORD0799 exp_F   mu 0.13906691
continues...

I would like to do a linear adjust like lm(mu~ORD, data=df) but for each group of type and exp. I have tried the following but its not working..:

intsl <- df %>% group_by(exp,type) %>% 
  fortify(lm(mu~ORD)) %>% 
  select(exp,type, .fitted, .resid) 

I need to use fortify because I need .fitted and .resid fields to later on do a multiplot sorting plots by type and exp using facet_grid contained in ggplot in order to check if there's heteroscedasticity in each fitted model.. like but in an orgnanized multiplot: enter image description here

Any suggestions? :<

Upvotes: 4

Views: 67

Answers (1)

Oriol Mirosa
Oriol Mirosa

Reputation: 2826

The documentation for fortify() in the ggplot2 package says that the method will be deprecated and that the broom package should be used instead. Based on the info here, you should do something like this:

library(dplyr)
library(broom)

intsl <- df %>%
  group_by(exp, type) %>%
  do(fit = lm(mu ~ ORD, .)

intsl %>% augment(fit)

This should give you the data frame with the variables you used to group your regressions, the regression variables, and the extra output for each observation such as .fitted and .resid, so you can move on to plot them with ggplot directly.

Upvotes: 1

Related Questions