Reputation: 123
I would like a label to appear above each box in a boxplot generated by ggplot2.
For example:
library(ggplot2)
library(tibble)
#Example data
test = c("A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B")
patient = c(1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3)
result = c(5, 7, 2, 4, 6, 7, 3, 5, 5, 6, 2, 3)
data <- tibble(test, patient, result)
#Labels I want to include
Alabs = c(1, 3, 500)
Blabs = c(8, 16, -32)
#Plot data
ggplot(data, aes(x = factor(patient), y = result, color = factor(test))) +
    geom_boxplot(outlier.shape = 1)
Gives the plot:
I would like to print the first element of Alabs above the red box for the first patient, the second element of Alabs above the red box for the second patient, the first element of Blabs above the blue box for the first patient, etc.
How do I do this?
Upvotes: 5
Views: 15898
Reputation: 36076
I would make a separate labels dataset for adding the labels.
labs = tibble(test = rep(LETTERS[1:2], each = 3),
              patient = c(1, 2, 3, 1, 2, 3),
              labels = c(1, 3, 500, 8, 16, -32) )
  test  patient labels
  <chr>   <dbl>  <dbl>
1 A           1      1
2 A           2      3
3 A           3    500
4 B           1      8
5 B           2     16
6 B           3    -32
The above contains all the information about the x-axis variable and the grouping (color) variable. What it's missing is the position of the text on the y axis. To put the labels above the boxes, we can calculate the maximum for each test-patient combination and add a small value to get the y position. (geom_text has a useful nudge_y argument, but it can't be combined with dodging.)
I make the summaries per group via dplyr, and then join the y position values to the labels dataset.
library(dplyr)
labeldat = data %>%
    group_by(test, patient) %>%
    summarize(ypos = max(result) + .25, .groups = "drop") %>%  # .groups needs dplyr >= 1.0.0
    inner_join(labs, by = c("test", "patient"))  # explicit join keys
Now you can add the geom_text layer, using the dataset of labels. To dodge the labels the same way as the boxplots, use position_dodge; the width = .75 used below lines the labels up with the boxes. To keep letters from showing up in the legend, I use show.legend = FALSE.
ggplot(data, aes(x = factor(patient), y = result, color = test)) +
    geom_boxplot(outlier.shape = 1) +
    geom_text(data = labeldat, aes(label = labels, y = ypos),
              position = position_dodge(width = .75),
              show.legend = FALSE )
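As a variation on the above, the offset can be applied in text units rather than data units: leave ypos at the group maximum and let vjust lift the text. This is only a minimal sketch, not part of the original answer. The names labeldat_v and p are made up for illustration, and `.groups = "drop"` assumes dplyr 1.0.0 or later.

```r
library(ggplot2)
library(dplyr)
library(tibble)

# Example data from the question
data <- tibble(
    test    = rep(c("A", "B"), each = 6),
    patient = rep(rep(c(1, 2, 3), each = 2), times = 2),
    result  = c(5, 7, 2, 4, 6, 7, 3, 5, 5, 6, 2, 3)
)

# One label per test/patient combination
labs <- tibble(
    test    = rep(c("A", "B"), each = 3),
    patient = rep(c(1, 2, 3), times = 2),
    labels  = c(1, 3, 500, 8, 16, -32)
)

# y position is the largest result in each group, with no offset added
labeldat_v <- data %>%
    group_by(test, patient) %>%
    summarize(ypos = max(result), .groups = "drop") %>%
    inner_join(labs, by = c("test", "patient"))

p <- ggplot(data, aes(x = factor(patient), y = result, color = test)) +
    geom_boxplot(outlier.shape = 1) +
    geom_text(data = labeldat_v, aes(label = labels, y = ypos),
              position = position_dodge(width = .75),
              vjust = -0.5,  # negative vjust moves the text upward
              show.legend = FALSE)
p
```

Because vjust is measured relative to the text height, the gap stays the same if the y axis is rescaled. If the topmost label ends up clipped at the panel edge, widening the y scale with expand_limits(y = ...) is one option.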
Upvotes: 4
Reputation: 994
Takes some cheating to get the labels into the same tibble:
# Line up the labels so each test/patient pair gets exactly one. Each label is drawn at the
# y value (result) of the row it sits on; here that is the second, larger value of each pair.
data$labs = c(NA, 1, NA, 3, NA, 500, NA, 8, NA, 16, NA, -32)
data$lab_x = c(NA, 0.75, NA, 1.75, NA, 2.75, NA, 1.25, NA, 2.25, NA, 3.25) # set x position for each one
Then run ggplot:
ggplot(data, aes(x = factor(patient), y = result, color = factor(test))) +
    geom_boxplot(outlier.shape = 1) +
    geom_text(aes(label = labs, x = lab_x))
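The lab_x values above were picked by eye. As a rough sketch, they can also be computed. Assuming the boxes at each patient share a total dodge width of 0.75 (the width passed to position_dodge in the other answer) and split it evenly, the center of box i out of n at position x is x + (i - (n + 1) / 2) * width / n. The helper name dodge_center is hypothetical; it is not a ggplot2 function.

```r
# Hypothetical helper: center of the i-th of n dodged elements at position x,
# assuming the n elements evenly share a total width of `width`
dodge_center <- function(x, i, n, width = 0.75) {
    x + (i - (n + 1) / 2) * width / n
}

patients <- c(1, 2, 3)
x_A <- dodge_center(patients, i = 1, n = 2)  # test A is the left box
x_B <- dodge_center(patients, i = 2, n = 2)  # test B is the right box
x_A
x_B
```

With two tests this gives 0.8125, 1.8125, 2.8125 for A and 1.1875, 2.1875, 3.1875 for B, so the computed centers sit slightly closer to each tick than the 0.75 and 1.25 offsets used above.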
Upvotes: 1