Tyler Rinker

Reputation: 110044

ggplot unexplained outcome

While building a reproducible example for another question, I hit a problem I can't get past. I am trying to plot categorical data as a faceted bar plot, so I made my own data set, CO3, from CO2 (code at the bottom). Plotting x by itself looks normal.

But things go wrong when I add the facet: the plot shows every count as equal.

That doesn't make sense. It would mean every subgroup had equal proportions of each outcome, which is not what an ftable of the data shows:

                   Type Quebec Mississippi
outcome Treatment                         
none    nonchilled           7           6
        chilled              4           7
some    nonchilled           6           4
        chilled              5           5
lots    nonchilled           5           4
        chilled              6           3
tons    nonchilled           3           7
        chilled              6           6

What am I doing wrong?

library(ggplot2)
set.seed(10)
# Type and Treatment from CO2, plus a randomly sampled outcome factor
CO3 <- data.frame(CO2[, 2:3], outcome=factor(sample(c('none', 'some', 'lots', 'tons'), 
           nrow(CO2), replace=TRUE), levels=c('none', 'some', 'lots', 'tons')))
CO3
x <- ggplot(CO3, aes(x=outcome)) + geom_bar()
x
x  + facet_grid(Treatment~., margins=TRUE)

with(CO3, ftable(outcome, Treatment, Type))

EDIT: The problem Brian describes is easy to run into whenever you need to stack data. To work around it until the next version of ggplot2 (I assume Hadley is aware of the problem), I wrote a small convenience function that quickly adds an ID column to a data frame:

IDer <- function(dataframe, id.name="id"){
    # seq_len() also behaves correctly for a zero-row data frame, unlike 1:nrow()
    DF <- data.frame(c=seq_len(nrow(dataframe)), dataframe)
    colnames(DF)[1] <- id.name
    return(DF)
}

IDer(mtcars)

Upvotes: 3

Views: 193

Answers (1)

Brian Diggs

Reputation: 58855

There is a bug in the 0.9.0 release of ggplot2 regarding facet_grid() and duplicated rows. See https://github.com/hadley/ggplot2/issues/443

A workaround is to add a dummy column that makes every row unique, which breaks the duplication.

CO3$dummy <- 1:nrow(CO3)

ggplot(CO3, aes(x=outcome)) + 
  geom_bar(aes(x=outcome)) + 
  facet_grid(Treatment~., margins=TRUE)
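
To check that duplicated rows really are present, and that the dummy column removes them, something like the following base R sketch could be used. It rebuilds CO3 as in the question; the names n_dup_before and n_dup_after are made up for illustration, and the exact count of duplicates may vary with the R version's random number generator.

```r
# Rebuild the question's data (base R only; CO2 ships with the datasets package).
set.seed(10)
CO3 <- data.frame(
  CO2[, 2:3],
  outcome = factor(
    sample(c("none", "some", "lots", "tons"), nrow(CO2), replace = TRUE),
    levels = c("none", "some", "lots", "tons")
  )
)

# Count rows that exactly repeat an earlier row.
# (n_dup_before / n_dup_after are illustrative names, not part of any API.)
n_dup_before <- sum(duplicated(CO3))

# Workaround from the answer: a unique dummy column makes every row distinct.
CO3$dummy <- seq_len(nrow(CO3))
n_dup_after <- sum(duplicated(CO3))

n_dup_before  # at least 68: 84 rows but at most 2 * 2 * 4 = 16 distinct combinations
n_dup_after   # 0
```

With only three categorical columns, heavy duplication is unavoidable here, which is why this data set triggers the bug so readily.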


Upvotes: 5
