Reputation: 1
the study design of the data I have to analyse is simple. There is 1 control group (CTRL) and 2 different treatment groups (TREAT_1 and TREAT_2). The data also includes 2 covariates COV1 and COV2. I have been asked to check if there is a linear or quadratic treatment effect in the data.
I created a dummy data set to explain my situation:
df1 <- data.frame(
Observation = c(rep("CTRL",15), rep("TREAT_1",13), rep("TREAT_2", 12)),
COV1 = c(rep("A1", 30), rep("A2", 10)),
COV2 = c(rep("B1", 5), rep("B2", 5), rep("B3", 10), rep("B1", 5), rep("B2", 5), rep("B3", 10)),
Variable = c(3944133, 3632461, 3351754, 3655975, 3487722, 3644783, 3491138, 3328894,
3654507, 3465627, 3511446, 3507249, 3373233, 3432867, 3640888,
3677593, 3585096, 3441775, 3608574, 3669114, 4000812, 3503511, 3423968,
3647391, 3584604, 3548256, 3505411, 3665138,
4049955, 3425512, 3834061, 3639699, 3522208, 3711928, 3576597, 3786781,
3591042, 3995802, 3493091, 3674475)
)
plot(Variable ~ Observation, data = df1)
As you can see from the plot there is a linear relationship between the control and the treatment groups. To check if this linear effect is statistical significant I change the contrasts using the contr.poly() function and fit a linear model like this:
contrasts(df1$Observation) <- contr.poly(levels(df1$Observation))
lm1 <- lm(log(Variable) ~ Observation, data = df1)
summary.lm(lm1)
From the summary we can see that the linear effect is statistically significant:
Observation.L 0.029141 0.012377 2.355 0.024 *
Observation.Q 0.002233 0.012482 0.179 0.859
However, this first model does not include any of the two covariates. Including them results in a non-significant p-value for the linear relationship:
lm2 <- lm(log(Variable) ~ Observation + COV1 + COV2, data = df1)
summary.lm(lm2)
Observation.L 0.04116 0.02624 1.568 0.126
Observation.Q 0.01003 0.01894 0.530 0.600
COV1A2 -0.01203 0.04202 -0.286 0.776
COV2B2 -0.02071 0.02202 -0.941 0.354
COV2B3 -0.02083 0.02066 -1.008 0.320
So far so good. However, I have been told to conduct a Type II Anova rather than Type I. To conduct a Type II Anova I used the Anova() function from the car package.
Anova(lm2, type="II")
Anova Table (Type II tests)
Response: log(Variable)
Sum Sq Df F value Pr(>F)
Observation 0.006253 2 1.4651 0.2453
COV1 0.000175 1 0.0820 0.7763
COV2 0.002768 2 0.6485 0.5292
Residuals 0.072555 34
The problem here with using Type II is that you do not get a p-value for the linear and quadratic effect. So I do not know if the effect is statistically linear and or quadratic.
I found out that the following code produces the same p-value for Observation as the Anova() function. But the result also does not include any p-values for the linear or quadratic effect:
lm2 <- lm(log(Variable) ~ Observation + COV1 + COV2, data = df1)
lm3 <- lm(log(Variable) ~ COV1 + COV2, data = df1)
anova(lm2, lm3)
Does anybody know how to conduct a Type II anova and the contrasts function to obtain the p-values for the linear and quadratic effects?
Help would be very much appreciated.
Best Peter
Upvotes: 0
Views: 1448
Reputation: 41
I found one partial workaround for this, but it may require further correction. The documentation for the function drop1() from the stats package indicates that this function produces Type II sums of squares (although this page: http://www.statmethods.net/stats/anova.html ) declares that drop1() produces Type III sums of squares, and I didn't spend too much time poring over this (http://afni.nimh.nih.gov/sscc/gangc/SS.html) to cross-check sums of squares calculations. You could use it to calculate everything manually, but I suspect you're asking this question because it would be nice if someone had already worked through it.
Anyway, I added a second vector to the dummy data called Observation2, and set it up with just the linear contrasts (you can only specify one set of contrasts for a given vector at a given time):
df1[,"Observation2"]<-df1$Observation
contrasts(df1$Observation2, how.many=1)<-contr.poly
Then created a third linear model:
lm3<-lm(log(Variable)~Observation2+COV1+COV2, data=df1)
And conducted F tests with drop1 to compare F statistics from Type II ANOVAs between the two models: lm2, which contains both the linear and quadratic terms:
drop1(lm2, test="F")
lm3, which contains just the linear contrasts:
drop1(lm3, test="F")
This doesn't include a direct comparison of the models against each other, although the F statistic is higher (and p value accordingly lower) for the linear model, which would lead one to rely upon it instead of the quadratic model.
Upvotes: 4