Reputation: 637
I am sure this question has been asked before, but I was unable to find anything similar. So consider a simple worked example.
We create random data and then create boxplots:
set.seed(123456)
Ax <- sample(1:3, size = 75, replace = TRUE)
Fac <- sample(LETTERS[1:4], 75, replace = TRUE)
yvalue <- runif(75)
df1 <- data.frame(Ax, Fac, yvalue)
library(ggplot2)
ggplot(df1, aes(factor(Ax), yvalue, colour = Fac)) +
  geom_boxplot()
But when we look at the data more closely:
table(df1$Ax, df1$Fac)
I want to create a boxplot like the one above, but when a group's size (n) is less than 6, then either:
That is, for the following data, circled in red.
Upvotes: 1
Views: 180
Reputation: 17648
You can try:
Add a column with the number of occurrences per group using ave():
df1$length <- ave(df1$yvalue, interaction(df1$Ax, df1$Fac), FUN=length)
Now, for instance, map alpha to that count so that the boxes of small groups are drawn faded:
ggplot(df1, aes(factor(Ax), yvalue, fill = Fac, alpha = factor(ifelse(length < 6, 0.5, 1)))) +
  geom_boxplot()
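Because alpha is mapped to a factor above, ggplot2 picks the transparency levels itself rather than using 0.5 and 1 literally. Below is a minimal sketch of one way to control them explicitly, assuming a logical flag column and scale_alpha_manual(). The column names n_group and small and the value 0.3 are illustrative choices, not part of the original answer.

```r
library(ggplot2)

# Same example data as in the question
set.seed(123456)
df1 <- data.frame(
  Ax     = sample(1:3, size = 75, replace = TRUE),
  Fac    = sample(LETTERS[1:4], 75, replace = TRUE),
  yvalue = runif(75)
)

# Number of observations in each Ax/Fac combination, repeated on every row
df1$n_group <- ave(df1$yvalue, df1$Ax, df1$Fac, FUN = length)

# Hypothetical flag column: TRUE for groups with fewer than 6 observations
df1$small <- df1$n_group < 6

# Map alpha to the flag and set the two transparency levels by hand;
# 0.3 is an arbitrary choice for the faded boxes
p <- ggplot(df1, aes(factor(Ax), yvalue, fill = Fac, alpha = small)) +
  geom_boxplot() +
  scale_alpha_manual(values = c("FALSE" = 1, "TRUE" = 0.3), guide = "none")
p
```

Since the flag is constant within each Ax/Fac combination, adding it to aes() should not change how the boxes are grouped.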
Upvotes: 2
Reputation: 1099
If you don't care about having placeholder spaces where the boxplots used to be, you can simply remove the observations that don't meet your criteria. The example below uses dplyr for the data manipulation:
library(dplyr)
library(ggplot2)
### Identify all groups that have > 5 observations per group
df2 <- df1 %>% group_by(Fac, Ax) %>% summarise(n = n()) %>% filter(n > 5)
### Only keep groups that meet our criteria
df3 <- df1 %>% semi_join(df2, by = c("Fac", "Ax"))
ggplot(df3, aes(factor(Ax), yvalue, colour = Fac)) +
  geom_boxplot()
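If dplyr is not available, the same filter can probably be written in base R with ave(). This is a minimal sketch under that assumption. The names n_group and df3_base are illustrative and do not appear in the answer above.

```r
# Same example data as in the question
set.seed(123456)
df1 <- data.frame(
  Ax     = sample(1:3, size = 75, replace = TRUE),
  Fac    = sample(LETTERS[1:4], 75, replace = TRUE),
  yvalue = runif(75)
)

# Number of observations in each Ax/Fac combination, one value per row
n_group <- ave(df1$yvalue, df1$Ax, df1$Fac, FUN = length)

# Keep only the rows that belong to groups with more than 5 observations
df3_base <- df1[n_group > 5, ]

# Check: every remaining combination is either empty or has more than 5 rows
table(df3_base$Ax, df3_base$Fac)
```

The resulting data frame can be passed to the same ggplot() call as df3.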
Upvotes: 1