Reputation: 187
I have run into an issue with ggplot2's geom_boxplot when a group contains only a few samples. geom_boxplot still draws the box and whiskers for that group, showing quartiles even though they are not meaningful for so few points.
I am hoping someone knows a way to stop ggplot2 from drawing the box and whiskers for groups with a small number of samples.
Here is a toy example to show the issue.
###Example
library(ggplot2)
#Set DF for plot
Num <- c(150, 196, 182, 224, 111, 129, 80, 183, 130, 171, 169, 165)
Group <- c("Three", "Three", "One", "Two", "One", "Two", "One", "Two", "One", "Two", "One", "Two")
DF <- data.frame(Num, Group)
#Make figure
p1 <- ggplot(DF, aes(Group, Num))
p1 + geom_boxplot(aes(fill = Group)) +
  scale_color_manual(values = c("#CC0000", "#0000E5", "#008000")) +
  theme_minimal() +
  scale_shape_manual(values = c(16, 17, 15)) +
  geom_point(size = 2.5) +
  scale_x_discrete(limits = c("One", "Two", "Three"))
Currently, this outputs the following figure, even though there are only two samples in the "Three" group. Is there a way to make a group show only its points when it has fewer than N samples?
For this figure, I would expect groups One and Two to look as they do now, but group Three to show only its two points and nothing else. Any help is greatly appreciated.
Upvotes: 1
Views: 2545
Reputation: 145755
The simplest solution is certainly to give geom_boxplot only the rows of data you want it to draw, by pre-computing the number of points in each group:
DF$n = with(DF, ave(Num, Group, FUN = length))
## if you like dplyr
# DF = group_by(DF, Group) %>% mutate(n = n())
ggplot(DF, aes(Group, Num)) +
geom_boxplot(data = subset(DF, n > 2), aes(fill = Group)) +
scale_color_manual(values = c("#CC0000", "#0000E5", "#008000")) +
theme_minimal() +
scale_shape_manual(values = c(16, 17, 15)) +
geom_point(size = 2.5) +
scale_x_discrete(limits = c("One", "Two", "Three"))
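If you want the cutoff to be adjustable (the "fewer than N" part of the question), the same idea can be sketched with the threshold pulled out into a variable. This is only a minimal sketch: `min_n` and `box_data` are names made up for illustration, and the two manual scales from the original are left out here on the assumption that they are not needed, since no colour or shape aesthetic is mapped.

```r
library(ggplot2)

# Same toy data as in the question
Num <- c(150, 196, 182, 224, 111, 129, 80, 183, 130, 171, 169, 165)
Group <- c("Three", "Three", "One", "Two", "One", "Two",
           "One", "Two", "One", "Two", "One", "Two")
DF <- data.frame(Num, Group)

# Hypothetical threshold: groups with fewer than min_n rows get points only
min_n <- 3

# Per-row group size, using the same ave() approach as above
DF$n <- ave(DF$Num, DF$Group, FUN = length)

# Only rows from sufficiently large groups are passed to the boxplot layer
box_data <- DF[DF$n >= min_n, ]

p <- ggplot(DF, aes(Group, Num)) +
  geom_boxplot(data = box_data, aes(fill = Group)) +  # boxes for large groups only
  geom_point(size = 2.5) +                            # points for every group
  scale_x_discrete(limits = c("One", "Two", "Three")) +
  theme_minimal()
p
```

Because the point layer still uses the full DF, group Three keeps its position on the x axis and shows just its two points.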
Upvotes: 2