ggplot2 histogram facet - include only relevant entries per facet plot

Question

What I want to accomplish is really simple conceptually, but I don't find a way to do it, unless I do 2 different plots and put them together with grid.arrange, which is not my preferred solution. I have data from entries of length 9 and length 10 to be faceted in the plot, but I only want the entries of length 9 on the left and the entries of length 10 on the right, not all everywhere with empty histogram bars.

MWE

library(reshape2)
library(ggplot2)
my.df <- data.frame(oligo=rep(c("NSSCMGSMNR","SSCMGSMNR","SSCMGSMNRR","VVGAGDVGK",
                                "VVGAVGVGK","VVVGAGDVGK","VVVGAVGVGK"), each=4), 
                    a11.a24=rep(c("-/-","+/-","-/+","+/+"), 7), 
                    a11=rep(c("-","+","-","+"), 7), a24=rep(c("-","-","+","+"), 7),
                    freq1=c(0,3,0,1,0,3,0,1,0,3,0,1,2,3,2,0,4,2,1,4,2,3,2,0,4,2,1,4), 
                    freq2=c(0,4,0,4,0,4,0,4,0,4,0,4,4,3,4,3,5,6,5,6,4,3,4,3,5,6,5,6), 
                    freq3=c(3,3,1,1,3,3,1,1,3,3,1,1,5,5,2,2,6,6,5,5,5,5,2,2,6,6,5,5), 
                    len=rep(c(10,9,10,9,9,10,10), each=4))
my.df <- my.df[order(my.df$len, my.df$freq1, decreasing=TRUE),]
my.df.m <- melt(my.df, measure.vars=c('a11.a24', 'a11', 'a24'))
my.df.m$oligo <- factor(my.df.m$oligo, levels=unique(my.df.m$oligo))
ggplot(my.df.m, aes(x=oligo)) +
  facet_grid(variable~len, scales="free_y") +
  geom_histogram(data=subset(my.df.m, variable=='a11.a24'),
    aes(y=freq1, fill=value), stat="identity", position=position_dodge()) +
  geom_histogram(data=subset(my.df.m, variable=='a11'),
    aes(y=freq2, fill=value), stat="identity", position=position_dodge()) +
  geom_histogram(data=subset(my.df.m, variable=='a24'),
    aes(y=freq3, fill=value), stat="identity", position=position_dodge()) +
  ggtitle("Number of patients with a given oligo") +
  theme(axis.text.x=element_text(angle=45, hjust=1, size=7), 
        legend.title=element_blank(), legend.position="right", 
        legend.background=element_blank(), legend.box.just="left",
        plot.title=element_text(size=15, face="bold", colour="black", vjust=1.5)) +
  scale_y_continuous(name = "num. patients") +
  scale_x_discrete(name = "oligo")

which produces: [figure: "Number of patients with a given oligo", bar plots faceted by variable (rows) and len (columns)]

All entries are in both X axes, and the bars corresponding to the ones of length 10 are empty on the left, and the bars corresponding to the ones of length 9 are empty on the right... Could I have only entries of length 9 on the left and only entries of length 10 on the right? Thanks

The figure also raises a second question about the order of the entries: why does 9 come before 10 in the facets, while on each X axis the entries of length 10 come before those of length 9? If I wanted to keep all entries in every facet, how could I get a consistent order everywhere (9 before 10)?

Note that the solution should be universal, this example is only lengths 9 and 10, but I could have any combination of lengths 8, 9, 10 and 11...

user2034412 · Accepted Answer

Change the scales to free on both axes:

facet_grid(variable~len, scales="free") (instead of "free_y")
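
As a minimal sketch of what this does, here is a reduced example. The `toy` data frame is made up for illustration, and `geom_col()` and `space="free_x"` are not part of the original code: `geom_col()` stands in for `geom_histogram(stat="identity")`, and `space="free_x"` is an optional extra that sizes each panel by its number of categories.

```r
library(ggplot2)

# Toy data (hypothetical): two oligos of length 9, two of length 10.
toy <- data.frame(
  oligo = c("SSCMGSMNR", "VVGAGDVGK", "NSSCMGSMNR", "VVVGAGDVGK"),
  len   = c(9, 9, 10, 10),
  freq  = c(3, 2, 1, 4)
)

# scales = "free" frees the x axis as well as the y axis, so each column
# of facets only shows the oligos that occur for that value of len.
p <- ggplot(toy, aes(x = oligo, y = freq)) +
  geom_col() +
  facet_grid(. ~ len, scales = "free", space = "free_x")
print(p)
```

With `scales="free_y"` the x axis is shared, which is why every oligo appears in both columns in the original plot.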

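For your second question: the facets are ordered by the numeric value of len, so 9 comes before 10. The x axis instead follows the levels of the oligo factor. Those levels were set from the row order of the data, and the rows were sorted by len in decreasing order, so the length-10 entries come first. You can fix this by reordering the levels of the oligo factor.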

If you want to do it based on the len column:

my.df.m$oligo <- factor(my.df.m$oligo,
                        levels=unique(as.character(my.df.m$oligo[order(my.df.m$len)])))
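
The same idea on a small standalone vector, as a sketch: the `oligo` and `len` vectors below are made up for illustration, and `len` is derived with `nchar()` on the assumption that it is simply the string length, as it is in the question's data. Nothing here is specific to lengths 9 and 10, so it should carry over to any mix of lengths.

```r
# Toy vector (hypothetical), with a length-10 oligo listed first.
oligo <- c("NSSCMGSMNR", "SSCMGSMNR", "VVVGAGDVGK", "VVGAGDVGK")
len   <- nchar(oligo)  # 10, 9, 10, 9

# Levels in order of increasing len; order() leaves ties in their original order.
lev     <- unique(oligo[order(len)])
oligo_f <- factor(oligo, levels = lev)

# The shorter oligos now come first on any axis that uses oligo_f.
print(levels(oligo_f))
```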
