Reputation: 41
Problem: I would like to learn how can I change an order of samples for which Tukey's test in R calculates means and assign corresponding letters. Very simple example is below.
I played with iris data and have found that there are differences in Sepal.Length among different Species. Here is the boxplot:
I conducted an ANOVA test and have found that differences are statistically significant.
> fit <- lm(Sepal.Length ~ Species, data = iris)
> summary(aov(fit))
Df Sum Sq Mean Sq F value Pr(>F)
Species 2 63.21 31.606 119.3 <2e-16 ***
Residuals 147 38.96 0.265
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Then I conducted Tukey's test and got following:
> library(agricolae)
> HSD.test(fit, "Species", group=T, console=T)
Study: fit ~ "Species"
HSD Test for Sepal.Length
Mean Square Error: 0.2650082
Species, means
Sepal.Length std r Min Max
setosa 5.006 0.3524897 50 4.3 5.8
versicolor 5.936 0.5161711 50 4.9 7.0
virginica 6.588 0.6358796 50 4.9 7.9
alpha: 0.05 ; Df Error: 147
Critical Value of Studentized Range: 3.348424
Honestly Significant Difference: 0.2437727
Means with the same letter are not significantly different.
Groups, Treatments and means
a virginica 6.588
b versicolor 5.936
c setosa 5.006
According the table of groups, HSD.test function sorts means into descending order and then assign letters. Thus, "virginica" have the largest mean, therefore it is the first in the table.
Questions: Is there any way to change the default sorting and assigning of letters? Can I sort samples in ascending order of means and then assign letters. An expected output is following:
a setosa 5.006
b versicolor 5.936
c virginica 6.588
Possible solution: In package multcomp there are two functions which can do it working together:
1 - glht
do Tukey's test
> an <- aov(fit)
> library(multcomp)
> glht(an, linfct = mcp(Species = "Tukey"))
General Linear Hypotheses
Multiple Comparisons of Means: Tukey Contrasts
Linear Hypotheses:
Estimate
versicolor - setosa == 0 0.930
virginica - setosa == 0 1.582
virginica - versicolor == 0 0.652
2 - cld
can provide me with letters assigned to Species
according levels of the factor iris$Species
> cld(glht(an, linfct = mcp(Species = "Tukey")))
setosa versicolor virginica
"a" "b" "c"
Unfortunately, glht
function do not show another data which can be useful and needed for creating of barplot (means, std, p-values). Of course, I can do it separately with another special functions, or just use both HSD.test
and cld
. But I would prefer to solve a problem with sorting of means in HSD.test
function and use only this one.
Upvotes: 4
Views: 4155
Reputation: 61
It is also possible to solve with multcompLetters() and TukeyHSD(). You should change the parameter "reversed"
library(multcompView)
fit <- aov(Sepal.Length ~ Species, data = iris)
tukey<-TukeyHSD(fit, ordered = T)
tukey_1<-multcompLetters2(Sepal.Length ~ Species,
tukey$Species[,"p adj"],
iris,reversed = T)
tukey_2<-multcompLetters2(Sepal.Length ~ Species,
tukey$Species[,"p adj"],
iris,reversed = F)
tukey_1
tukey_2
tapply(iris$Sepal.Length, iris$Species, mean)
Upvotes: 0
Reputation: 335
First of all, thanks for the function. It was what I was looking for. But I think that there is a mistake in
res <- vector(mode = "character", length=length(trt)),
it should be
res <- vector(mode = "character", length=length("trt"))
Upvotes: -1
Reputation: 405
I notice it's a bit late to answer this question. However I ran into the exact same problem and like to share my solution as future reference. Hope it helps someday someone.
One could use multcompLetters()
for example with the results from TukeyHSD()
. However, this doesn't allow an arbitrary ordering of the result and is not so easy to use.
Since I needed an arbitrary order I wrote my own function, which takes a vector of letters like returned from HSD.test
and swaps the letters in a way the result is nice. Meaning letters first in alphabet appearing first.
library(agricolae)
reorder<-function(inV){
collapsed <- paste(inV,sep="",collapse = "")
u <- unique(strsplit(collapsed,"")[[1]])
if(length(u)<2){
return(inV)
}
u <- u[order(u)]
m <- matrix(nrow=NROW(inV),ncol=length(u))
m[]<-F
for(i in 1:length(inV)){
s <- strsplit(inV[i],"")[[1]]
index <- match(s,u)
m[i,index] <- T
}
for(i in 1:(length(u)-1)){
firstColT <- match(T,m[,i])[1] #first row with true in current column
firstT <- match(T,rowSums(m[,i:length(u)] > 0))[1] #first row with true in rest
if(firstT < firstColT){
colT <- match(T,m[firstT,i:length(u)])[1]
colT <- colT + i - 1 #correct index for leftout columns in match
tmp <- m[,colT]
m[,colT] <- m[,i]
m[,i] <- tmp
}
}
res <- vector(mode = "character", length=length(trt))
for(i in 1:length(inV)){
l <- u[m[i,]]
res[i] <- paste(l,sep="",collapse = "")
}
return(res)
}
fit <- lm(Sepal.Length ~ Species, data = iris)
a <- HSD.test(fit, "Species", group=T, console=F)$groups
a <- a[rev(rownames(a)),] #order the result the way you want
a$M <- reorder(as.character(a$M))
For the example it is a bit of an overkill but it should also work for more complicated cases.
Upvotes: 2