Connor

Reputation: 31

Sorting a side by side bar graph based on one bar, ggplot2

I am trying to create a side by side bar graph in ggplot2 sorted numerically by the left bar of the side by side graph. I've tried the reorder function, but that seems to sort by the average of the two bars and not just one of them.

Example side by side bar plot

library(ggplot2)

a<-(c(1:10))
e<-c("group a","group b", "group c", "group d", "group e", "group a","group b", "group c", "group d", "group e")
fillvariable<-c(1,2,2,1,2,2,1,1,2,1)
data<-cbind(a,e,fillvariable)
data<-as.data.frame(data)
data

plot <- ggplot(data, aes(x=e, y=a,fill=factor(fillvariable))) + geom_bar(stat = "identity", position = 'dodge')

plot

I would like to sort the bars numerically by the left (red) bar (see sample bar graph). My real x axis has many groups, so it would be less than ideal to type each label and set the order that way. Does anyone have a suggestion on how to do this using a function in R?

I'm realizing that I have another issue too. I'm not sure how to make sure my bar plot puts the 1 fill factor on the left every time. Any advice on that would be appreciated as well.

sample bar output

Upvotes: 3

Views: 1628

Answers (1)

Gregor Thomas

Reputation: 145755

I would do it like this:

# isolate the subset of data you want to order by
subset_to_order = subset(data, fillvariable == 1)
# use reorder to reorder the factor
subset_to_order$e = with(subset_to_order, reorder(e, a))

# apply the same order to the whole data
data$e = factor(data$e, levels = levels(subset_to_order$e))

plot <- ggplot(data, aes(x=e, y=a,fill=factor(fillvariable))) + geom_col(position = 'dodge')
plot


MrFlick's comment shows a similar way. Basically your question is the same as the R-FAQ about ordering bars in ggplot2, but with a twist that the order is determined by a subset of the data instead of the whole data. We use the same solution, adapting it to find a way to ignore the parts of the data that don't matter for the ordering. I did it with an explicit subset, MrFlick's comment does it by zeroing out the other parts with ifelse. Both work just fine.
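For comparison, here is a minimal sketch of the ifelse idea. The comment itself isn't reproduced here, so this is my reconstruction rather than MrFlick's exact code, and FUN = sum is my own choice (reorder defaults to mean). The data is rebuilt with data.frame() so that a stays numeric.

```r
# Same example data as in the question, built with data.frame()
data <- data.frame(
  a = 1:10,
  e = rep(paste("group", c("a", "b", "c", "d", "e")), times = 2),
  fillvariable = c(1, 2, 2, 1, 2, 2, 1, 1, 2, 1)
)

# Rows with fillvariable != 1 contribute 0, so each group's score is just
# the height of its fillvariable == 1 bar.
# FUN = sum is an assumption of this sketch, not taken from the comment.
data$e <- with(data, reorder(e, ifelse(fillvariable == 1, a, 0), FUN = sum))

levels(data$e)
# "group a" "group d" "group b" "group c" "group e"
```

This skips the intermediate subset, at the cost of being a little less explicit about which rows drive the order.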

I've also switched your geom_bar(stat = "identity") to geom_col(), which is preferred in the last few ggplot2 releases.

This does depend on not messing up the classes of your columns. When you use cbind, you implicitly convert to a matrix, which holds a single type. That is bad when you mix numeric and string/factor columns: here the numbers in a become strings. Keep it as a data frame and everything should be fine.
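As a sketch of what "keep it as a data frame" could look like for the example data, assuming you are free to change how it is built:

```r
# Build the data frame directly so each column keeps its own type
data <- data.frame(
  a = 1:10,
  e = rep(paste("group", c("a", "b", "c", "d", "e")), times = 2),
  fillvariable = c(1, 2, 2, 1, 2, 2, 1, 1, 2, 1)
)

# a and fillvariable stay numeric here; after cbind() they would be strings
str(data)

# Order the groups by the bars where fillvariable == 1, as above
subset_to_order <- subset(data, fillvariable == 1)
data$e <- factor(
  data$e,
  levels = levels(reorder(subset_to_order$e, subset_to_order$a))
)

levels(data$e)
# "group a" "group d" "group b" "group c" "group e"
```

With e releveled this way, the ggplot() call above draws the groups in that order.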

Upvotes: 4
