Minh Chau

Reputation: 135

How to get adjusted means for an ANCOVA in Python

Suppose I have the following data frame:

import numpy as np
import pandas as pd

df = pd.DataFrame({'water': np.repeat(['daily', 'weekly'], 15),
                   'sun': np.tile(np.repeat(['low', 'med', 'high'], 5), 2),
                   'height': [6, 6, 6, 5, 6, 5, 5, 6, 4, 5,
                              6, 6, 7, 8, 7, 3, 4, 4, 4, 5,
                              4, 4, 4, 4, 4, 5, 6, 6, 7, 8],
                   'phosphorus': [8, 9, 3, 5, 6, 5, 7, 6, 4, 5,
                                  6, 6, 7, 8, 8, 3, 4, 4, 4, 15,
                                  4, 6, 4, 15, 4, 5, 6, 6, 17, 8]})

I want to perform an ANCOVA with water and sun as the independent variables, height as the dependent variable, and phosphorus as the covariate. Here is a brief summary of height by each independent variable:

water = df.groupby('water').agg({'height': ['count','mean','std','var']}).reset_index()
sun = df.groupby('sun').agg({'height': ['count','mean','std','var']}).reset_index()

# Height by Water              
           count  mean   std   var
0   daily     15  5.87  0.99  0.98
1  weekly     15  4.80  1.37  1.89

# Height by Sun                
         count mean   std   var
0  high     10  6.6  0.97  0.93
1   low     10  4.9  1.10  1.21
2   med     10  4.5  0.71  0.50

Using OLS from statsmodels, I fit the following ANCOVA model:

import statsmodels.api as sm

# Fit the ANCOVA model
model = sm.formula.ols('height ~ C(sun) + C(water) + phosphorus', data=df).fit()   # build and fit the model
ancova_table = sm.stats.anova_lm(model, typ=2)                                     # Type II ANCOVA table
alpha = .05

# Print Ancova Table
print(ancova_table)

The ANCOVA table indicates that sun and water are both significant predictors of height, and that phosphorus is also a significant covariate (all p-values are below alpha = .05):

            sum_sq    df      F    PR(>F)
C(sun)       19.79   2.0  20.49  5.38e-06
C(water)      9.71   1.0  20.10  1.42e-04
phosphorus    3.20   1.0   6.63  1.64e-02
Residual     12.07  25.0    NaN       NaN

My question is: how can I perform a post-hoc analysis of this ANCOVA model? Specifically, how can I calculate the mean height for each level of water and sun after adjusting for phosphorus as a covariate? How would these adjusted means differ from the means before adjusting for the covariate?

# Without Adjustment for covariate?
# Height by Water              
           count  mean   std   var
0   daily     15  5.87  0.99  0.98
1  weekly     15  4.80  1.37  1.89

# Height by Sun               
         count mean   std   var
0  high     10  6.6  0.97  0.93
1   low     10  4.9  1.10  1.21
2   med     10  4.5  0.71  0.50
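
One common definition of an adjusted (estimated marginal) mean is the model's prediction for a factor level with the covariate held at its overall mean, averaged with equal weight over the levels of the other factor. The following is a minimal sketch under that assumption, and it is not necessarily the only valid post-hoc approach. The names `grid`, `adj_water`, and `adj_sun` are hypothetical, and the model is refit via `statsmodels.formula.api` so that the block runs on its own.

```python
from itertools import product

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Same data as in the question
df = pd.DataFrame({'water': np.repeat(['daily', 'weekly'], 15),
                   'sun': np.tile(np.repeat(['low', 'med', 'high'], 5), 2),
                   'height': [6, 6, 6, 5, 6, 5, 5, 6, 4, 5,
                              6, 6, 7, 8, 7, 3, 4, 4, 4, 5,
                              4, 4, 4, 4, 4, 5, 6, 6, 7, 8],
                   'phosphorus': [8, 9, 3, 5, 6, 5, 7, 6, 4, 5,
                                  6, 6, 7, 8, 8, 3, 4, 4, 4, 15,
                                  4, 6, 4, 15, 4, 5, 6, 6, 17, 8]})

# Same ANCOVA model as in the question
model = smf.ols('height ~ C(sun) + C(water) + phosphorus', data=df).fit()

# Reference grid: every sun x water combination, with the covariate
# fixed at its overall sample mean (an assumption of this sketch)
grid = pd.DataFrame(list(product(df['sun'].unique(), df['water'].unique())),
                    columns=['sun', 'water'])
grid['phosphorus'] = df['phosphorus'].mean()
grid['adjusted'] = model.predict(grid)

# Adjusted mean per level: average the grid predictions over the other factor
adj_water = grid.groupby('water')['adjusted'].mean()
adj_sun = grid.groupby('sun')['adjusted'].mean()

# Side-by-side comparison with the unadjusted group means
print(pd.DataFrame({'raw': df.groupby('water')['height'].mean(),
                    'adjusted': adj_water}))
print(pd.DataFrame({'raw': df.groupby('sun')['height'].mean(),
                    'adjusted': adj_sun}))
```

Because this design is balanced and the model has no interaction term, each adjusted mean equals the raw group mean minus the phosphorus slope times the gap between that group's mean phosphorus and the overall mean phosphorus. Groups whose mean phosphorus lies far from the overall mean therefore move the most.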

Upvotes: 2

Views: 112

Answers (0)
