Matifou

Reputation: 8880

ggplot2: geom_bar, computing facet-wise percentage

I would like to use geom_bar with facets and show percentages instead of absolute counts, where the percentages are relative to each facet, not to the overall count.

This has been discussed a lot, with suggestions to use geom_bar(aes(y = (..count..)/sum(..count..))). This won't work with facets (i.e. the proportions are relative to the total count across all facets). A better solution has been suggested, using stat_count(mapping = aes(x=x_val, y=..prop..)) instead.

This seems to work if x is numeric, but not if x is character: all bars are 100%! Why? Am I doing something wrong? Thanks!

library(tidyverse)
# tibble() replaces the deprecated data_frame()
df <- tibble(val_num = c(rep(1, 60), rep(2, 40), rep(1, 30), rep(2, 70)),
             val_cat = ifelse(val_num==1, "cat", "mouse"),
             group=rep(c("A", "B"), each=100))

#works with numeric 
ggplot(df) + stat_count(mapping = aes(x=val_num, y=..prop..)) + facet_grid(group~.)

# does not work? 
ggplot(df) + stat_count(mapping = aes(x=val_cat, y=..prop..)) + facet_grid(group~.)
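For reference, the facet-wise proportions the plot is supposed to show can be computed directly from the data. This is only a sketch in base R for checking the bar heights, not part of the plotting code; the name `expected` is made up for illustration, and the data is rebuilt with data.frame() so the snippet runs on its own.

```r
# Rebuild the example data with base R only
val_num <- c(rep(1, 60), rep(2, 40), rep(1, 30), rep(2, 70))
df <- data.frame(
  val_num = val_num,
  val_cat = ifelse(val_num == 1, "cat", "mouse"),
  group   = rep(c("A", "B"), each = 100)
)

# Cross-tabulate group x val_cat, then normalise each row (margin = 1)
# so that proportions sum to 1 within each group, i.e. within each facet
expected <- prop.table(table(df$group, df$val_cat), margin = 1)
expected
```

Each row of the table corresponds to one facet, so a correct plot should reproduce these values as bar heights.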

Upvotes: 2

Views: 2709

Answers (1)

eipi10

Reputation: 93761

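Adding group=group (that is, setting the group aesthetic to the group column, which is also the facet variable) tells ggplot to calculate proportions within each group, rather than the default, which would be separately for each level of val_cat.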

ggplot(df) + 
  stat_count(aes(x=val_cat, y=..prop.., group=group)) + 
  facet_grid(group~.)
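A variant of the same idea, as a sketch assuming ggplot2 3.3.0 or later, where after_stat(prop) is the current spelling of ..prop..: setting group = 1 puts all rows of a panel into a single group, and since the stat is computed per panel, the proportions again come out per facet. The percent axis labels and the layer_data() check are additions for illustration, and the data is rebuilt with data.frame() so the snippet is self-contained.

```r
library(ggplot2)

# Rebuild the example data
val_num <- c(rep(1, 60), rep(2, 40), rep(1, 30), rep(2, 70))
df <- data.frame(
  val_num = val_num,
  val_cat = ifelse(val_num == 1, "cat", "mouse"),
  group   = rep(c("A", "B"), each = 100)
)

# group = 1: one group per panel, so prop is relative to the facet total
p <- ggplot(df) +
  stat_count(aes(x = val_cat, y = after_stat(prop), group = 1)) +
  scale_y_continuous(labels = scales::percent) +
  facet_grid(group ~ .)

# Inspect the computed values instead of reading them off the plot
layer_data(p)[, c("PANEL", "x", "count", "prop")]
```

Printing p draws the plot; the layer_data() call shows one row per bar, with prop summing to 1 within each PANEL.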


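When the x-variable is continuous, it looks like stat_count by default calculates percentages over all data in the facet. However, when the x-variable is categorical, stat_count calculates percentages separately within each x level. This is consistent with ggplot2's default grouping, which is the interaction of all discrete variables in the plot: a discrete x puts each level into its own group, and prop is computed per group, so every bar is 100% of its own group. See what happens with the following examples: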

Adding val_num as the group aesthetic causes percentages to be calculated within each x level instead of over all values in a facet.

ggplot(df) + 
  stat_count(aes(x=val_num, y=..prop.., group=val_num)) + 
  facet_grid(group~.)

Turning val_num into a factor likewise causes percentages to be calculated within each x level instead of over all values in a facet.

ggplot(df) + 
  stat_count(aes(x=factor(val_num), y=..prop..)) + 
  facet_grid(group~.)
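To see the grouping directly rather than infer it from the bar heights, the computed layer data can be inspected with layer_data(). This is a sketch assuming ggplot2 3.3.0 or later for after_stat(); the names p_num and p_cat are made up for illustration, and the data is rebuilt with data.frame() so the snippet runs on its own.

```r
library(ggplot2)

# Rebuild the example data
val_num <- c(rep(1, 60), rep(2, 40), rep(1, 30), rep(2, 70))
df <- data.frame(
  val_num = val_num,
  val_cat = ifelse(val_num == 1, "cat", "mouse"),
  group   = rep(c("A", "B"), each = 100)
)

p_num <- ggplot(df) +
  stat_count(aes(x = val_num, y = after_stat(prop))) +
  facet_grid(group ~ .)

p_cat <- ggplot(df) +
  stat_count(aes(x = val_cat, y = after_stat(prop))) +
  facet_grid(group ~ .)

# Numeric x: all bars in a panel share one group, so prop sums to 1 per panel
layer_data(p_num)[, c("PANEL", "group", "x", "count", "prop")]

# Discrete x: each x level is its own group, so every bar has prop = 1
layer_data(p_cat)[, c("PANEL", "group", "x", "count", "prop")]
```

The group column is what differs between the two outputs: a single value for the numeric x, and one value per level for the categorical x.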

Upvotes: 5
