Vasily A

Reputation: 8626

ggplot layer breaks order when placed first

My plot has a categorical X axis and several types of elements to draw:

  library(data.table)
  library(ggplot2)

  dt1 <- fread('
    ID    type value
    a1     bar   40 
    a1   point   30 
    b1     bar   50 
    b1   point   20 
    c1     bar   30 
    c1   point   50 
    c1   point   20 
    d1   point   30 
    d1   point   50 
    e1    none   50 
    a2     bar   45 
    a2   point   30 
    ')
  
  # I want some custom order on the plot:
  dt1[, ID:=factor(ID, levels=unique(ID[order(value)]))]
  #  resulting levels: b1 - c1 - a1 - d1 - a2 - e1 (e1 has type 'none' and is never plotted)

If I build a plot with geom_point() followed by geom_bar(), the order of the X axis is correct:

ggplot(dt1, aes(x=ID,y=value))+
  geom_point(data=dt1[type=='point',], size=5, col='red') +
  geom_bar(  data=dt1[type=='bar',], stat='identity', alpha=0.5)

[Image: plot with the correct order]

But if I have geom_bar() as the first layer followed by geom_point(), it ignores the levels of my x (ID) variable and reorders it alphabetically:

ggplot(dt1, aes(x=ID,y=value))+
  geom_bar(  data=dt1[type=='bar',], stat='identity', alpha=0.5)+
  geom_point(data=dt1[type=='point',], size=5, col='red')

[Image: plot with the wrong order of bars]

(Note that geom_bar() as the single layer has the correct order; the problem occurs only when it is followed by another layer!) Why does this happen, and how can I fix it? I found one workaround, adding scale_x_discrete(drop=FALSE), but I don't like it because it adds categories that are not supposed to be there:

ggplot(dt1, aes(x=ID,y=value))+
  geom_bar(  data=dt1[type=='bar',], stat='identity', alpha=0.5)+
  geom_point(data=dt1[type=='point',], size=5, col='red')+
  scale_x_discrete(drop=FALSE)

[Image: plot with the unwanted "e1" category added]

Upvotes: 2

Views: 256

Answers (1)

lroha

Reputation: 34406

This happens because the later layer contains levels that are not present in the earlier layer. Unused levels are dropped by default, so once the first layer has been processed the scale only knows about the levels that layer actually uses. When the second layer then brings in a level the scale has not seen, ggplot2 doesn't know how to merge (what have become) two different factors, so the values are converted to a character vector (and then back to a factor) before being plotted, which yields alphabetical order. You can use the limits argument in scale_x_discrete() to specify the desired order explicitly.

library(ggplot2)
library(data.table)

ggplot() + 
  aes(x=ID,y=value) +
  geom_col(data=dt1[type=='bar',], alpha=0.5) +
  geom_point(data=dt1[type=='point',], size=5, col='red') +
  scale_x_discrete(limits = levels(droplevels(dt1$ID[dt1$type %in% c("bar", "point")])))

You can do it a little more neatly by building the limits from a subset of your data before plotting:

dt2 <- dt1[type != "none"]
dt2[, ID:=factor(ID, levels=unique(ID[order(value)]))]

ggplot() +
  aes(x=ID,y=value) +
  geom_col(data=dt1[type=='bar',], alpha=0.5) +
  geom_point(data=dt1[type=='point',], size=5, col='red') +
  scale_x_discrete(limits = levels(dt2$ID))
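
To see the mismatch described above, one can compare the levels each layer's subset actually uses. The following is a minimal base-R sketch, not ggplot2's internal code: it rebuilds the question's data as a plain data.frame (an assumption, made so that it runs without data.table), and the names df, bar_levels, point_levels and plotted_levels are illustrative only.

```r
# Rebuild the question's data as a plain data.frame (assumption: base R only).
df <- data.frame(
  ID    = c("a1", "a1", "b1", "b1", "c1", "c1", "c1", "d1", "d1", "e1", "a2", "a2"),
  type  = c("bar", "point", "bar", "point", "bar", "point", "point",
            "point", "point", "none", "bar", "point"),
  value = c(40, 30, 50, 20, 30, 50, 20, 30, 50, 50, 45, 30),
  stringsAsFactors = FALSE
)

# Same custom ordering as in the question.
df$ID <- factor(df$ID, levels = unique(df$ID[order(df$value)]))

# Levels actually used by each layer's subset, kept in the custom order.
bar_levels   <- levels(droplevels(df$ID[df$type == "bar"]))
point_levels <- levels(droplevels(df$ID[df$type == "point"]))

# Levels the point layer uses but the bar layer lacks.
print(setdiff(point_levels, bar_levels))   # "d1"

# Levels the bar layer uses but the point layer lacks (none).
print(setdiff(bar_levels, point_levels))   # character(0)

# Limits covering everything that is plotted, still in the custom order.
plotted_levels <- levels(droplevels(df$ID[df$type %in% c("bar", "point")]))
print(plotted_levels)                      # "b1" "c1" "a1" "d1" "a2"
```

The bar subset lacks d1, so with bars first the point layer introduces a level the scale has not seen yet, whereas with points first the bar layer adds nothing new. plotted_levels is the kind of vector one would pass as limits to scale_x_discrete().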

Upvotes: 3
