Reputation: 371
Ok, so I think I'm pretty close with this, but I'm getting an error when I try to construct my box plot at the end. My goal is to place letters denoting statistical relationships among the time points above each boxplot. I've seen two discussion of this on this site, and can reproduce the results from their code, but can't apply it to my dataset.
Packages
library(ggplot2)
library(multcompView)
library(plyr)
Here is my data:
dput(WaterConDryMass)
structure(list(ChillTime = structure(c(1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L), .Label = c("Pre_chill",
"6", "13", "24", "Post_chill"), class = "factor"), dmass = c(0.22,
0.19, 0.34, 0.12, 0.23, 0.33, 0.38, 0.15, 0.31, 0.34, 0.45, 0.48,
0.59, 0.54, 0.73, 0.69, 0.53, 0.57, 0.39, 0.8)), .Names = c("ChillTime",
"dmass"), row.names = c(NA, -20L), class = "data.frame")
ANOVA and Tukey Post-hoc
Model4 <- aov(dmass~ChillTime, data=WaterConDryMass)
tHSD <- TukeyHSD(Model4, ordered = FALSE, conf.level = 0.95)
plot(tHSD , las=1 , col="brown" )
Function:
generate_label_df <- function(TUKEY, flev){
# Extract labels and factor levels from Tukey post-hoc
Tukey.levels <- TUKEY[[flev]][,4]
Tukey.labels <- multcompLetters(Tukey.levels)['Letters']
plot.labels <- names(Tukey.labels[['Letters']])
boxplot.df <- ddply(WaterConDryMass, flev, function (x) max(fivenum(x$y)) + 0.2)
# Create a data frame out of the factor levels and Tukey's homogenous group letters
plot.levels <- data.frame(plot.labels, labels = Tukey.labels[['Letters']],
stringsAsFactors = FALSE)
# Merge it with the labels
labels.df <- merge(plot.levels, boxplot.df, by.x = 'plot.labels', by.y = flev, sort = FALSE)
return(labels.df)
}
Boxplot:
ggplot(WaterConDryMass, aes(x = ChillTime, y = dmass)) +
geom_blank() +
theme_bw() +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
labs(x = 'Time (weeks)', y = 'Water Content (DM %)') +
ggtitle(expression(atop(bold("Water Content"), atop(italic("(Dry Mass)"), "")))) +
theme(plot.title = element_text(hjust = 0.5, face='bold')) +
annotate(geom = "rect", xmin = 1.5, xmax = 4.5, ymin = -Inf, ymax = Inf, alpha = 0.6, fill = "grey90") +
geom_boxplot(fill = 'green2', stat = "boxplot") +
geom_text(data = generate_label_df(tHSD), aes(x = plot.labels, y = V1, label = labels)) +
geom_vline(aes(xintercept=4.5), linetype="dashed") +
theme(plot.title = element_text(vjust=-0.6))
Error:
Error in HSD[[flev]] : invalid subscript type 'symbol'
Upvotes: 2
Views: 19291
Reputation: 4309
I think I found the tutorial you are following, or something very similar. You would probably be best to copy and paste this whole thing into your work space, function and all, to avoid missing a few small differences.
Basically I have followed the tutorial (http://www.r-graph-gallery.com/84-tukey-test/) to the letter and added a few necessary tweaks at the end. It adds a few extra lines of code, but it works.
generate_label_df <- function(TUKEY, variable){
# Extract labels and factor levels from Tukey post-hoc
Tukey.levels <- TUKEY[[variable]][,4]
Tukey.labels <- data.frame(multcompLetters(Tukey.levels)['Letters'])
#I need to put the labels in the same order as in the boxplot :
Tukey.labels$treatment=rownames(Tukey.labels)
Tukey.labels=Tukey.labels[order(Tukey.labels$treatment) , ]
return(Tukey.labels)
}
model=lm(WaterConDryMass$dmass~WaterConDryMass$ChillTime )
ANOVA=aov(model)
# Tukey test to study each pair of treatment :
TUKEY <- TukeyHSD(x=ANOVA, 'WaterConDryMass$ChillTime', conf.level=0.95)
labels<-generate_label_df(TUKEY , "WaterConDryMass$ChillTime")#generate labels using function
names(labels)<-c('Letters','ChillTime')#rename columns for merging
yvalue<-aggregate(.~ChillTime, data=WaterConDryMass, mean)# obtain letter position for y axis using means
final<-merge(labels,yvalue) #merge dataframes
ggplot(WaterConDryMass, aes(x = ChillTime, y = dmass)) +
geom_blank() +
theme_bw() +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
labs(x = 'Time (weeks)', y = 'Water Content (DM %)') +
ggtitle(expression(atop(bold("Water Content"), atop(italic("(Dry Mass)"), "")))) +
theme(plot.title = element_text(hjust = 0.5, face='bold')) +
annotate(geom = "rect", xmin = 1.5, xmax = 4.5, ymin = -Inf, ymax = Inf, alpha = 0.6, fill = "grey90") +
geom_boxplot(fill = 'green2', stat = "boxplot") +
geom_text(data = final, aes(x = ChillTime, y = dmass, label = Letters),vjust=-3.5,hjust=-.5) +
geom_vline(aes(xintercept=4.5), linetype="dashed") +
theme(plot.title = element_text(vjust=-0.6))
Upvotes: 4