jessi

Reputation: 1518

Why does a stacked bar plot change when I add a facet in R ggplot2?

I have data that are in several groups, and I want to display them in a faceted stacked bar chart. The data show responses to a survey question. When I look at them in the dataframe, they make sense, and when I plot them (without faceting), they make sense.

However, the data appear to change when they are faceted. I have never had this problem before. I was able to reproduce a change (not exactly the same one) with some dummy data.

myDF <- data.frame(rep(c('aa','ab','ac'), each = 9),
               rep(c('x','y','z'),times = 9),
               rep(c("yes", "no", "maybe"), each=3, times=3),
               sample(50:200, 27, replace=FALSE))
colnames(myDF) <- c('place','program','response','number')
library(dplyr)
myDF2 <- myDF %>%
    group_by(place,program) %>%
    mutate(pct=(100*number)/sum(number))

The data in myDF are essentially counts of responses to a question. myDF2 only adds the percentage of respondents giving each response within each place and program.
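As a quick sanity check on those percentages, here is a minimal sketch that rebuilds the dummy data and confirms that pct sums to 100 within every place/program pair. The set.seed() call and the name pct_check are assumptions added for reproducibility and illustration; they are not part of the original code.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed so the dummy counts are reproducible

# Same dummy data as above, with the column names set directly
myDF <- data.frame(
  place    = rep(c("aa", "ab", "ac"), each = 9),
  program  = rep(c("x", "y", "z"), times = 9),
  response = rep(c("yes", "no", "maybe"), each = 3, times = 3),
  number   = sample(50:200, 27, replace = FALSE)
)

# Percentage of each response within each place/program pair
myDF2 <- myDF %>%
  group_by(place, program) %>%
  mutate(pct = (100 * number) / sum(number))

# One row per place/program pair; the pct column holds the group total
pct_check <- aggregate(pct ~ place + program, data = myDF2, FUN = sum)
print(pct_check)
```

If the percentages were computed within the intended groups, every row of pct_check should show a total of 100.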

library(ggplot2)
my.plot <-ggplot(myDF2, 
             aes(x=place, y=pct)) +
    geom_bar(aes(fill=myDF$response),stat="identity")

my.plot.facet <-ggplot(myDF2, 
                   aes(x=place, y=pct)) +
    geom_bar(aes(fill=myDF$response),stat="identity")+
    facet_wrap(~program)

I am hoping to see a plot that shows the proper "pct" for each "response" within each "program" and "place". However, my.plot.facet shows only one "response" per place.

[Image: example of my.plot and my.plot.facet]

The data are not like that. For example, head(myDF2) shows that place 'aa' in program 'x' has both yes and no.

> head(myDF2)
Source: local data frame [6 x 5]
Groups: place, program

  place program response number      pct
1    aa       x      yes     69 18.35106
2    aa       y      yes     95 25.81522
3    aa       z      yes    192 41.64859
4    aa       x       no    129 34.30851
5    aa       y       no    188 51.08696
6    aa       z       no    162 35.14100
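Because head() prints only the first six rows, the 'maybe' row for this pair is cut off. A minimal sketch for looking at a single place/program pair directly, assuming the same dummy data (set.seed() and the name aa_x are illustrative assumptions):

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed so the dummy counts are reproducible

myDF <- data.frame(
  place    = rep(c("aa", "ab", "ac"), each = 9),
  program  = rep(c("x", "y", "z"), times = 9),
  response = rep(c("yes", "no", "maybe"), each = 3, times = 3),
  number   = sample(50:200, 27, replace = FALSE)
)

myDF2 <- myDF %>%
  group_by(place, program) %>%
  mutate(pct = (100 * number) / sum(number))

# Keep only the rows for place 'aa' in program 'x'
aa_x <- filter(myDF2, place == "aa", program == "x")
print(aa_x)
```

This should return three rows, one each for yes, no, and maybe, which is what the faceted plot ought to show for that bar.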

Upvotes: 0

Views: 1731

Answers (1)

jessi

It turns out that order matters here. myDF2 is no longer a plain data frame: group_by() returns a grouped dplyr table (a grouped_df), and ggplot2 seems to struggle with it.

If the data need to be faceted by program, 'program' needs to come first in the group_by() call.

You can see this by reversing the faceting: put program on the x-axis and facet by place, which matches the group_by(place, program) order used in the question.

my.plot.facet2 <-ggplot(myDF2, 
                       aes(x=program, y=pct)) +
   geom_bar(aes(fill=myDF2$response),stat="identity")+
   facet_wrap(~place)

produces:

[Image: my.plot.facet2]
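Another likely cause, separate from the grouping order, is the fill=myDF$response mapping itself. It passes a full-length vector from outside the plot's data, and depending on the ggplot2 version that vector may not be matched to the rows that end up in each facet panel. Recent ggplot2 versions warn about using `$` inside aes() for this reason. Below is a minimal sketch, assuming the same dummy data, that maps fill to the bare column name and drops the grouping with ungroup() before plotting. The set.seed() call and the name my.plot.facet.fixed are illustrative assumptions.

```r
library(dplyr)
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the dummy counts are reproducible

myDF <- data.frame(
  place    = rep(c("aa", "ab", "ac"), each = 9),
  program  = rep(c("x", "y", "z"), times = 9),
  response = rep(c("yes", "no", "maybe"), each = 3, times = 3),
  number   = sample(50:200, 27, replace = FALSE)
)

myDF2 <- myDF %>%
  group_by(place, program) %>%
  mutate(pct = (100 * number) / sum(number)) %>%
  ungroup()  # hand ggplot2 an ungrouped table

# fill refers to the column by its bare name, so it is looked up in the
# same data that is split across the facet panels
my.plot.facet.fixed <- ggplot(myDF2, aes(x = place, y = pct, fill = response)) +
  geom_bar(stat = "identity") +
  facet_wrap(~program)

print(my.plot.facet.fixed)
```

With this mapping, each place in every program panel should show all three responses stacked to 100, whichever variable comes first in group_by().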

Upvotes: 1
