Reputation: 307
I want to do a trend analysis for an ANOVA that has both btw-Ss and within-Ss factors.
The btw factors are "treatments" The within factors are "trials".
test.data <- data.frame(sid = rep(c("s1", "s2", "s3", "s4", "s5"), each = 4),
treatments = rep(c("a1", "a2"), each = 20),
trials = rep(c("b1", "b2", "b3", "b4"), 10),
responses = c(3,5,9,6,7,11,12,11,9,13,14,12,4,8,11,7,1,3,5,4,5,6,11,7,10,12,18,15,10,15,15,14,6,9,13,9,3,5,9,7))}
The ANOVA matches the one in the textbook (Keppel, 1973) exactly:
aov.model.1 <- aov(responses ~ treatments*trials + Error(sid/trials), data=tab20.09)
What I am having trouble with is the trend analysis. I want to look at the linear, quadratic, and cubic trends for “trials”. Would also be nice to look at these same trends for “treatments x trials”.
I have set up the contrasts for the trend analyses as:
contrasts(tab20.09$trials) <- cbind(c(-3, -1, 1, 3), c(1, -1, -1, 1), c(-1, 3, -3, 1))
contrasts(tab20.09$trials)
[,1] [,2] [,3]
b1 -3 1 -1
b2 -1 -1 3
b3 1 -1 -3
b4 3 1 1
for the linear, quadratic, and cubic trends.
According to Keppel the results for the trends should be:
TRIALS:
SS df MS F
(Trials) (175.70) 3
Linear 87.12 1 87.12 193.60
Quadratic 72.90 1 72.90 125.69
Cubic 15.68 1 15.68 9.50
TREATMENTS X TRIALS
SS df MS F
(Trtmt x Trials)
(3.40) 3
Linear 0.98 1 0.98 2.18
Quadratic 0.00 1 0.00 <1
Cubic 2.42 1 2.42 1.47
ERROR TERMS
(21.40) (24)
Linear 3.60 8 0.45
Quadratic 4.60 8 0.58
Cubic 13.20 8 1.65
I have faith in his answers as once upon the time I had to derive them myself using a 6 function calculator supplemented by paper and pencil. However, when I do this:
contrasts(tab20.09$trials) <- cbind(c(-3, -1, 1, 3), c(1, -1, -1, 1), c(-1, 3, -3, 1))
aov.model.2 <- aov(responses ~ treatments*trials + Error(sid/trials), data=tab20.09)
summary(lm(aov.model.2))
what I get seems not to make sense.
summary(lm(aov.model.2))
Call:
lm(formula = aov.model.2)
Residuals:
ALL 40 residuals are 0: no residual degrees of freedom!
Coefficients: (4 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.750e+00 NA NA NA
treatmentsa2 3.500e+00 NA NA NA
trials1 6.500e-01 NA NA NA
trials2 -1.250e+00 NA NA NA
trials3 -4.500e-01 NA NA NA
sids10 -3.250e+00 NA NA NA
sids2 4.500e+00 NA NA NA
sids3 6.250e+00 NA NA NA
sids4 1.750e+00 NA NA NA
sids5 -2.500e+00 NA NA NA
sids6 -2.000e+00 NA NA NA
sids7 4.500e+00 NA NA NA
sids8 4.250e+00 NA NA NA
sids9 NA NA NA NA
treatmentsa2:trials1 2.120e-16 NA NA NA
treatmentsa2:trials2 -5.000e-01 NA NA NA
treatmentsa2:trials3 5.217e-16 NA NA NA
trials1:sids10 1.500e-01 NA NA NA
trials2:sids10 7.500e-01 NA NA NA
trials3:sids10 5.000e-02 NA NA NA
trials1:sids2 -1.041e-16 NA NA NA
trials2:sids2 -2.638e-16 NA NA NA
trials3:sids2 5.000e-01 NA NA NA
trials1:sids3 -1.500e-01 NA NA NA
trials2:sids3 -2.500e-01 NA NA NA
trials3:sids3 4.500e-01 NA NA NA
trials1:sids4 -5.000e-02 NA NA NA
trials2:sids4 -7.500e-01 NA NA NA
trials3:sids4 1.500e-01 NA NA NA
trials1:sids5 -1.000e-01 NA NA NA
trials2:sids5 5.000e-01 NA NA NA
trials3:sids5 3.000e-01 NA NA NA
trials1:sids6 -1.000e-01 NA NA NA
trials2:sids6 5.000e-01 NA NA NA
trials3:sids6 -2.000e-01 NA NA NA
trials1:sids7 4.000e-01 NA NA NA
trials2:sids7 5.000e-01 NA NA NA
trials3:sids7 -2.000e-01 NA NA NA
trials1:sids8 -5.000e-02 NA NA NA
trials2:sids8 2.500e-01 NA NA NA
trials3:sids8 6.500e-01 NA NA NA
trials1:sids9 NA NA NA NA
trials2:sids9 NA NA NA NA
trials3:sids9 NA NA NA NA
Residual standard error: NaN on 0 degrees of freedom
Multiple R-squared: 1, Adjusted R-squared: NaN
F-statistic: NaN on 39 and 0 DF, p-value: NA
Any ideas what I am doing wrong? I suspect there is some problem with “lm” and the ANOVA but I don’t know what and I don’t know how to put in my trend analyses.
###### MORE DETAILS in response to ssdecontrol's responseWell, "trials" is a factor as it codes four levels of experience that are being manipulated. Likewise "sid" is the "subject identification number" that is definitely "nominal" not "ordinal" or "interval". Subjects are pretty much always treated as Factors in ANOVAS.
However, I did try both of these changes, but it greatly distorted the ANOVA (try it yourself and compare). Likewise, it didn't seem to help. PERHAPS MORE DIRECTLY RELEVANT, when I try to create and apply my contrasts I am told that it cannot be done as my numerics need to be factors:
contrasts(tab20.09$trials) <- cbind(c(-3, -1, 1, 3), c(1, -1, -1, 1), c(-1, 3, -3, 1))
Error in `contrasts<-`(`*tmp*`, value = c(-3, -1, 1, 3, 1, -1, -1, 1, :
contrasts apply only to factors
STARTING OVER I seem to make more progress using contr.poly as in
contrasts(tab20.09$trials) <- contr.poly(levels(tab20.09$trials))
The ANOVA doesn't change at all. So that is good and when I do:
lm.model <- lm(responses ~ trials, data = tab20.09)
summary.lm(lm.model)
I get basically the same pattern as Keppel.
BUT, as I am interested in the linear trend of the interaction (treatments x trials), not just on trials, I tried this:
lm3 <- lm(responses ~ treatments*trials, data = tab20.09)
summary.lm(lm3)
and the ME of "trials" goes away . . .
In Keppel’s treatment, he calculated separate error terms for each contrast (i.e., Linear, Quadratic, and Cubic) and used that on both the main effect of “trial” as well as on the “treatment x trial” interaction.
I certainly could hand calculate all of these things again. Perhaps I could even write R functions for the general case; however, it seems difficult to believe that such a basic core contrast for experimental psychology has not yet found an R implementation!!??
Any help or suggestions would be greatly appreciated. Thanks. W
Upvotes: 0
Views: 1255
Reputation: 13883
It looks like trials
and sids
are factors, but you are intending for them to be numeric/integer. Run sapply(tab20.09, class)
to see if that's the case. That's what the output means; instead of fitting a continuous/count interaction, it's fitting a dummy variable for each level of each variable and computing all of the interactions between them.
To fix it, just reassign tab20.09$trials <- as.numeric(tab20.09$trials)
and tab20.09$sids <- as.numeric(tab20.09$sids)
in list syntax, or you can use matrix syntax like tab20.09[, c("trials", "sids")] <- apply(tab20.09[, c("trials", "sids")], 2, as.numeric)
. The first one is easier in this case, but you should be aware of the second one as well.
Upvotes: 0