Robbie

Reputation: 275

Percentages adding up within rather than across groups with the scales package in a ggplot2 bar plot?

I want to compare automatic and manual cars (mtcars dataset) using a bar plot (ggplot2).

I have a plot that shows counts on the y-axis (left-hand plot below), but I would like one with percentages on the y-axis instead.

I want this so that I can compare cars more easily and say, e.g., 'models with four cylinders make up x percent of automatic cars but only y percent of manual cars'.

I've tried using this scales package tutorial https://www.tutorialspoint.com/how-to-create-a-bar-plot-using-ggplot2-with-percentage-on-y-axis-in-r for a neat way of changing counts to percentages (right-hand plot below).

The problem is that the percentages add up to 100% across the automatic and manual cars combined. I want the percentages to add up to 100% within the automatic cars and within the manual cars separately.

Is there some way of doing that using the scales package or some other package?

Thanks!

# Packages 
library(ggplot2)
library(scales)

# Data 
data(mtcars)
mtcars$cyl <- as.factor(mtcars$cyl)
mtcars$am <- as.factor(mtcars$am)

# Good counts plot 
ggplot(data=mtcars, aes(x=am, fill=cyl)) +
  geom_bar(stat="count", position=position_dodge()) + scale_fill_grey() +
  ggtitle(expression(bold("mtcars")))  + xlab("automatic or manual") + ylab("count") + 
  theme(text=element_text(size=20)) + 
  theme(plot.title = element_text(size = 18, face = "bold"))

# Bad percentages plot 
ggplot(data=mtcars, aes(x=am, fill=cyl)) +
  geom_bar(aes(y=(..count..)/sum(..count..)), position=position_dodge()) + scale_fill_grey() +
  ggtitle(expression(bold("mtcars")))  + xlab("automatic or manual") + ylab("percentage") + 
  theme(text=element_text(size=20)) + 
  theme(plot.title = element_text(size = 18, face = "bold")) 
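
To make the difference between the two normalisations concrete, here is a minimal base R sketch (not part of the plotting code above; the names `tab`, `across` and `within` are only illustrative). It relies on `sum(..count..)` in the second plot summing over every bar in the panel, which corresponds to dividing by the grand total rather than by each `am` total:

```r
# Cross-tabulate transmission (rows) by number of cylinders (columns)
tab <- table(am = mtcars$am, cyl = mtcars$cyl)

# What the second plot shows: each count divided by the grand total of cars
across <- prop.table(tab)
round(across, 3)

# What is wanted: each count divided by the total of its own am row
within <- prop.table(tab, margin = 1)
round(within, 3)
```

Each row of `within` sums to 1, whereas in `across` only the whole table sums to 1.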

[Image: counts plot (left) and percentages plot (right)]

Upvotes: 1

Views: 406

Answers (2)

Allan Cameron

Reputation: 174338

If you want to do the whole thing inside ggplot2 (which is not always the easiest way), you could do:

ggplot(mtcars, aes(x = cyl, group = am, fill = cyl)) +
  geom_bar(aes(y = after_stat(prop), fill = factor(after_stat(x)))) + 
  scale_x_discrete(expand = c(0.5, 0.2)) +
  scale_y_continuous(labels = scales::percent) +
  scale_fill_manual(values = c("gray25", "gray50", "gray75"),
                    labels = levels(mtcars$cyl)) +
  facet_grid(.~am, switch = "x") +
  ggtitle(expression(bold("mtcars")))  + 
  labs(x = "automatic or manual", y = "percentage") + 
  theme(text                = element_text(size = 20),
        plot.title          = element_text(size = 18, face = "bold"),
        axis.text.x         = element_blank(),
        axis.ticks.length.x = unit(0, "points"),
        panel.spacing       = unit(0, "points"),
        strip.placement     = "outside",
        strip.background    = element_blank())


Upvotes: 1

Wolfgang Arnold

Reputation: 1252

I only know how to calculate the percentage per am manually (using tidyverse):

library(tidyverse)

pl_df <- mtcars %>%
    select(am, cyl) %>%     # we're only interested in am and cyl
    group_by(am, cyl) %>%   # group data and
    add_count(cyl) %>%      # add count of cylinders (per am)
    unique() %>%            # remove duplicates
    ungroup() %>%           # remove grouping
    group_by(am) %>%        # group by am for... 
    mutate(cyl_percentage = n/sum(n)) %>%  # ...calculating percentage
    mutate(cyl = as.factor(cyl)) %>%       # change to factors so that ggplot treats...
    mutate(am = as.factor(am))             # ...am and cyl as discrete variables

ggplot(data = pl_df, aes(x = am, fill = cyl, y = cyl_percentage)) +
    geom_bar(stat = "identity", position=position_dodge()) + 
    scale_fill_grey() +
    ggtitle(expression(bold("mtcars"))) +
    xlab("automatic or manual") + 
    ylab("percentage") + 
    theme(text=element_text(size=20)) + 
    theme(plot.title = element_text(size = 18, face = "bold")) 
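
As a possible shortening of the summary step above, the `add_count()` / `unique()` pair could probably be replaced by a single `dplyr::count()` call. This is only a sketch under that assumption; `pl_df2` is a made-up name, and the result should be usable in the same `ggplot()` call as `pl_df`:

```r
library(dplyr)

pl_df2 <- mtcars %>%
    count(am, cyl) %>%                        # one row per am/cyl combination, count stored in n
    group_by(am) %>%                          # group by am so that...
    mutate(cyl_percentage = n / sum(n)) %>%   # ...shares are computed within each am group
    ungroup() %>%
    mutate(am = as.factor(am),                # factors so that ggplot treats am and cyl
           cyl = as.factor(cyl))              # as discrete variables
```

Because the shares are computed after `group_by(am)`, they sum to 1 within automatic cars and within manual cars separately.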

Upvotes: 2
