BarneyC
BarneyC

Reputation: 529

ggplot will not plot missing category

I'm struggling with ggplot (I always do). There are a number of very similar questions about forcing ggplot to include zero value categories in legends - here and here (for example). BUT I (think I) have a slightly different requirement to which all my mucking about with scale_x_discrete and scale_fill_manual has not helped.

Requirement: As you can see; the right-hand plot has no data in the TM=5 category - so is missing. What I need is for that right plot to have category 5 shown on the axis but obviously with no points or box.

enter image description here

Current Plot Script:

#data
plotData <- data.frame("TM"    = c(3,2,3,3,3,4,3,2,3,3,4,3,4,3,2,3,2,2,3,2,3,3,3,2,3,1,3,2,2,4,4,3,2,3,4,2,3),
                       "Score" = c(5,4,4,4,3,5,5,5,5,5,5,3,5,5,4,4,5,4,5,4,5,4,5,4,4,4,4,4,5,4,4,5,3,5,5,5,5))
#vars
xTitle <- bquote("T"["M"])
v.I    <- plotData$TM
depVar <- plotData$Score

#plot
p <- ggplot(plotData, aes_string(x=v.I,y=depVar,color=v.I)) +
  geom_point() +
  geom_jitter(alpha=0.8, position = position_jitter(width = 0.2, height = 0.2)) +
  geom_boxplot(width=0.75,alpha=0.5,aes_string(group=v.I)) +
  theme_bw() +
  labs(x=xTitle) +
  labs(y=NULL) +
  theme(legend.position='none', 
        axis.text=element_text(size=10, face="bold"),
        axis.title=element_text(size=16))

Attempted Solutions:

  1. drop=False to scales (suggested by @Jarretinha here) totally borks margins and x-axis labels

    > plot + scale_x_discrete(drop=FALSE) + scale_fill_manual(drop=FALSE)

enter image description here

  1. Following logic from here and manually setting the labels in scale_fill_manual does nothing and results in the same right-hand plot from example above.

    > p + scale_fill_manual(values = c("red", "blue", "green", "purple", "pink"), labels = c("Cat1", "Cat2", "Cat3", "Cat4", "Cat5"), drop=FALSE)

  2. Playing with this logic and trying something with scale_x_discrete results in a change to category names on x-axis but the fifth is still missing AND the margins (as attempt 1) are borked again. BUT apparent that scale_x_discrete is important and NOT the whole answer

    > p + scale_x_discrete(limits = c("Cat1", "Cat2", "Cat3", "Cat4", "Cat5"), drop=FALSE)

enter image description here

ANSWER for above example courtesy of input from @Bouncyball & @aosmith

#data
plotData    <- data.frame("TM"    = c(3,2,3,3,3,4,3,2,3,3,4,3,4,3,2,3,2,2,3,2,3,3,3,2,3,1,3,2,2,4,4,3,2,3,4,2,3),
                       "Score" = c(5,4,4,4,3,5,5,5,5,5,5,3,5,5,4,4,5,4,5,4,5,4,5,4,4,4,4,4,5,4,4,5,3,5,5,5,5))
plotData$TM <- factor(plotData$TM, levels=1:5) # add correct (desired number of factors to input data)

#vars
xTitle <- bquote("T"["M"])
v.I    <- plotData$TM
depVar <- plotData$Score
myPalette <- c('#5c9bd4','#a5a5a4','#4770b6','#275f92','#646464','#002060')

#plot
ggplot(plotData, aes_string(x=v.I,y=depVar,color=v.I)) +
  geom_jitter(alpha=0.8, position = position_jitter(width = 0.2, height = 0.2)) +
  geom_boxplot(width=0.75,alpha=0.5,aes_string(group=v.I)) +
  scale_colour_manual(values = myPalette, drop=F) +  # new line added here
  scale_x_discrete(drop=F) + # new line added here
  theme_bw() +
  labs(x=xTitle) +
  labs(y=NULL) +
  theme(legend.position='none', 
        axis.text=element_text(size=10, face="bold"),
        axis.title=element_text(size=16))

enter image description here

Upvotes: 4

Views: 5346

Answers (1)

bouncyball
bouncyball

Reputation: 10761

Here's a workaround you could use:

# generate dummy data 
set.seed(123)
df1 <- data.frame(lets = sample(letters[1:4], 20, replace = T),
                  y = rnorm(20), stringsAsFactors = FALSE)
# define factor, including the missing category as a level
df1$lets <- factor(df1$lets, levels = letters[1:5])
# make plot
ggplot(df1, aes(x = lets, y = y))+
    geom_boxplot(aes(fill = lets))+
    geom_point(data = NULL, aes(x = 'e', y = 0), pch = NA)+
    scale_fill_brewer(drop = F, palette = 'Set1')+
    theme_bw()

enter image description here

Basically, we plot an "empty" point (i.e. pch = NA) so that the category shows up on the x-axis, but has no visible geom associated with it. We also define our discrete variable, lets as a factor with five levels when only four are present in the data.frame. The missing category is the letter e.

NB: You'll have to adjust the positioning of this "empty" point so that it doesn't skew your y axis.

Otherwise, you could use the result from this answer to avoid having to plot an "empty" point.

# generate dummy data 
set.seed(123)
df1 <- data.frame(lets = sample(letters[1:4], 20, replace = T),
                  y = rnorm(20), stringsAsFactors = FALSE)
# define factor, including the missing category as a level
df1$lets <- factor(df1$lets, levels = letters[1:5])
# make plot
ggplot(df1, aes(x = lets, y = y)) +
    geom_boxplot(aes(fill = lets)) +
    scale_x_discrete(drop = F) +
    scale_fill_brewer(drop = F, palette = 'Set1') +
    theme_bw()

enter image description here

Upvotes: 2

Related Questions