Reputation: 275
I want to compare automatic and manual cars (mtcars dataset) using a bar plot (ggplot2).
I've got a plot that shows counts on the y-axis (left-hand plot below), but I would prefer one with percentages on the y-axis.
I want this so that I can compare cars more easily and say, e.g., 'models with four cylinders make up x percent of automatic cars but only y percent of manual cars'.
I've tried following this scales package tutorial (https://www.tutorialspoint.com/how-to-create-a-bar-plot-using-ggplot2-with-percentage-on-y-axis-in-r) for a neat way of changing counts to percentages (right-hand plot below).
The problem is that the percentages add up to 100% across the automatic and manual cars combined. I want them to add up to 100% within the automatic cars and within the manual cars separately.
Is there some way of doing that using the scales package or some other package?
Thanks!
# Packages
library(ggplot2)
library(scales)
# Data
data(mtcars)
mtcars$cyl <- as.factor(mtcars$cyl)
mtcars$am <- as.factor(mtcars$am)
# Good counts plot
ggplot(data = mtcars, aes(x = am, fill = cyl)) +
  geom_bar(stat = "count", position = position_dodge()) +
  scale_fill_grey() +
  ggtitle(expression(bold("mtcars"))) +
  xlab("automatic or manual") +
  ylab("count") +
  theme(text = element_text(size = 20)) +
  theme(plot.title = element_text(size = 18, face = "bold"))
# Bad percentages plot
ggplot(data = mtcars, aes(x = am, fill = cyl)) +
  geom_bar(aes(y = (..count..) / sum(..count..)), position = position_dodge()) +
  scale_fill_grey() +
  ggtitle(expression(bold("mtcars"))) +
  xlab("automatic or manual") +
  ylab("percentage") +
  theme(text = element_text(size = 20)) +
  theme(plot.title = element_text(size = 18, face = "bold"))
Upvotes: 1
Views: 406
Reputation: 174338
If you want to do the whole thing inside ggplot2 (which is not always the easiest way), you could do:
# Assumes mtcars$cyl has already been converted to a factor, as in the question
ggplot(mtcars, aes(x = cyl, group = am, fill = cyl)) +
  # prop is computed within each group (here: within each level of am)
  geom_bar(aes(y = after_stat(prop), fill = factor(after_stat(x)))) +
  scale_x_discrete(expand = c(0.5, 0.2)) +
  scale_y_continuous(labels = scales::percent) +
  scale_fill_manual(values = c("gray25", "gray50", "gray75"),
                    labels = levels(mtcars$cyl)) +
  facet_grid(. ~ am, switch = "x") +
  ggtitle(expression(bold("mtcars"))) +
  labs(x = "automatic or manual", y = "percentage") +
  theme(text = element_text(size = 20),
        plot.title = element_text(size = 18, face = "bold"),
        axis.text.x = element_blank(),
        axis.ticks.length.x = unit(0, "points"),
        panel.spacing = unit(0, "points"),
        strip.placement = "outside",
        strip.background = element_blank())
Upvotes: 1
Reputation: 1252
I only know how to calculate the percentage per am manually (using tidyverse):
library(tidyverse)
pl_df <- mtcars %>%
  select(am, cyl) %>%                      # we're only interested in am and cyl
  group_by(am, cyl) %>%                    # group data and
  add_count(cyl) %>%                       # add count of cylinders (per am)
  unique() %>%                             # remove duplicates
  ungroup() %>%                            # remove grouping
  group_by(am) %>%                         # group by am for...
  mutate(cyl_percentage = n / sum(n)) %>%  # ...calculating percentage
  mutate(cyl = as.factor(cyl)) %>%         # change to factors so that ggplot treats...
  mutate(am = as.factor(am))               # ...am and cyl as discrete variables
ggplot(data = pl_df, aes(x = am, fill = cyl, y = cyl_percentage)) +
  geom_bar(stat = "identity", position = position_dodge()) +
  scale_fill_grey() +
  ggtitle(expression(bold("mtcars"))) +
  xlab("automatic or manual") +
  ylab("percentage") +
  theme(text = element_text(size = 20)) +
  theme(plot.title = element_text(size = 18, face = "bold"))
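As a quick sanity check on the numbers being plotted, the same within-group shares can be computed in base R. This is only a sketch and not part of the approach above; the names counts and shares are made up for illustration, and it assumes the built-in mtcars data, where am is coded 0 for automatic and 1 for manual.

```r
# Load a fresh copy of the built-in dataset
data(mtcars)

# Cross-tabulate: rows are am (0 = automatic, 1 = manual), columns are cyl
counts <- table(am = mtcars$am, cyl = mtcars$cyl)

# margin = 1 divides every cell by its row total,
# so the shares add up to 1 within each level of am
shares <- prop.table(counts, margin = 1)

# Show the shares as percentages, rounded to one decimal place
print(round(100 * shares, 1))

# Each row total should be 1 (i.e. 100% within automatic and within manual)
print(rowSums(shares))
```

For example, four-cylinder models are 3 of the 19 automatic cars (about 15.8%) but 8 of the 13 manual cars (about 61.5%), which is the kind of comparison the question asks for.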
Upvotes: 2