Reputation: 153
I'm trying to reproduce the example in the bottom of the following page of the NIST:
http://www.itl.nist.gov/div898/handbook/ppc/section2/ppc233.htm
With Minitab and SPSS it is not an issue but with R I don't get it.
I tried:
mydata<-read.csv("mydata.csv",header=T,sep=";")
summary(aov(Measure~factor(Machine)/factor(Operator),data=mydata))
But the F values are not correct.
Thanks for your help
PS: Here is the dataset of the NIST (mydata.csv)
Operator;Machine;Replicate;Measure
1;1;1;0.125
1;1;2;0.127
1;1;3;0.125
1;1;4;0.126
1;1;5;0.128
1;2;1;0.118
1;2;2;0.122
1;2;3;0.12
1;2;4;0.124
1;2;5;0.119
1;3;1;0.123
1;3;2;0.125
1;3;3;0.125
1;3;4;0.124
1;3;5;0.126
1;4;1;0.126
1;4;2;0.128
1;4;3;0.126
1;4;4;0.127
1;4;5;0.129
1;5;1;0.118
1;5;2;0.129
1;5;3;0.127
1;5;4;0.12
1;5;5;0.121
2;1;1;0.124
2;1;2;0.128
2;1;3;0.127
2;1;4;0.126
2;1;5;0.129
2;2;1;0.116
2;2;2;0.125
2;2;3;0.119
2;2;4;0.125
2;2;5;0.12
2;3;1;0.122
2;3;2;0.121
2;3;3;0.124
2;3;4;0.126
2;3;5;0.125
2;4;1;0.126
2;4;2;0.129
2;4;3;0.125
2;4;4;0.13
2;4;5;0.124
2;5;1;0.125
2;5;2;0.123
2;5;3;0.114
2;5;4;0.124
2;5;5;0.117
Upvotes: 0
Views: 1011
Reputation: 49660
You can specify the model with random terms and how to divide to get the F ratios by using the Error
term in the formula (I changed Machine and Operator to factors in the data):
> summary(aov(Measure~Machine + Error(interaction(Machine,Operator)),data=mydat))
Error: interaction(Machine, Operator)
Df Sum Sq Mean Sq F value Pr(>F)
Machine 4 0.0003033 7.583e-05 20.38 0.00269 **
Residuals 5 0.0000186 3.720e-06
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Error: Within
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 40 0.000346 8.65e-06
Upvotes: 1
Reputation: 37889
The summary of the aov
function:
> summary(aov(Measure~factor(Machine)/factor(Operator),data=mydata))
Df Sum Sq Mean Sq F value Pr(>F)
factor(Machine) 4 0.0003033 7.583e-05 8.766 3.52e-05 ***
factor(Machine):factor(Operator) 5 0.0000186 3.720e-06 0.430 0.825
Residuals 40 0.0003460 8.650e-06
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
As you say in your question 8.766
is different to 20.38
that you expected.
However, this is not an error of the function. This is because of how the aov
function works. As the documentations says (?aov
):
Fit an analysis of variance model by a call to lm for each stratum.
Notice the 'each' in the above line. This means that the aov
function always calculates the F statistic (F value) based on the variable MSE / Residuals MSE
. As you can see 8.766
is the result of the division between factor(Machine)
and Residuals
i.e. 75.83 / 8.65 = 8.766
.
This can be seen in the source code as well if you want to dig a little deeper (type summary.aov
on the console) where it says:
if (rdf > 0L) {
TT <- ms/ms[nt]
TP <- pf(TT, df, rdf, lower.tail = FALSE)
TT[nt] <- TP[nt] <- NA
x$"F value" <- TT
x$"Pr(>F)" <- TP
}
And this is also why the F value
for factor(Machine):factor(Operator)
is correct.
In case you want to find the F-statistic and the associated probability, of the division between factor(Machine)
and factor(Machine):factor(Operator)
just calculate it on your own as:
a <- summary(aov(Measure~factor(Machine)/factor(Operator),data=mydata))
and
> a[[1]]$'Mean Sq'[1] / a[[1]]$'Mean Sq'[2] #which is what you want
[1] 20.38441
The corresponding probability Pr(>F)
:
> pf(20.38441, df1=4, df2=5, lower.tail = FALSE)
[1] 0.00269263
Upvotes: 0