Luther_Blissett

Reputation: 327

R & ggplot2 - how to plot relative frequency of a categorical split by a binary variable

I can easily make a relative frequency plot with one 'base' categorical variable along the x-axis and the relative frequency of a second categorical variable on the y-axis:

library(ggplot2)
ggplot(diamonds) +
  aes(x = cut, fill = color) +
  geom_bar(position = "fill")

Now say I have that categorical variable split in some way by a binary variable:

diamonds <- data.frame(diamonds)
diamonds$binary_dummy <- sample(c(0, 1), nrow(diamonds), replace = TRUE)

How do I plot the original categorical variable while also showing the binary split within the colour ('color') variable? Preferably, each half of the split would be drawn in a different shade of the original colour.

Basically I am trying to reproduce this plot: Freq_plot_example

As you can see from the legend, each category is split by "NonSyn"/"Syn", and each half of the split is coloured as a dark or light shade of one distinct colour (e.g. "regulatory proteins NonSyn" = dark pink, "regulatory proteins Syn" = light pink).

Upvotes: 1

Views: 639

Answers (1)

teunbrand

Reputation: 38063

If you don't mind setting the palette manually, you could do something like this:

library(ggplot2)
library(colorspace)

df <- data.frame(diamonds)
df$binary_dummy <- sample(c(0, 1), nrow(df), replace = TRUE)

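# One base colour per level of `color`
pal <- scales::brewer_pal(palette = "Set1")(nlevels(df$color))
# Interleave each base colour with a darkened copy: base1, dark1, base2, dark2, ...
pal <- c(rbind(pal, darken(pal, amount = 0.2)))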

ggplot(df, aes(x = cut, fill = interaction(binary_dummy, color))) +
  geom_bar(position = "fill") +
  scale_fill_manual(values = pal)

Created on 2020-04-14 by the reprex package (v0.3.0)
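
The palette above only lines up because the order produced by c(rbind(...)) happens to match the order of the levels of interaction(binary_dummy, color), where the first argument varies fastest. Here is a minimal base R sketch of that pairing; the toy data frame `d` and the colour names are made up for illustration and are not part of the original answer:

```r
# Hypothetical toy data: every combination of a binary flag and three colour levels
d <- expand.grid(binary = c(0, 1), color = c("D", "E", "F"))

# interaction() varies its first argument fastest:
# both binary values for "D", then both for "E", and so on
lv <- levels(interaction(d$binary, d$color))

# c(rbind(a, b)) interleaves two equal-length vectors: a1, b1, a2, b2, ...
base_cols <- c("red", "blue", "green")
dark_cols <- paste0("dark", base_cols)  # stand-in for colorspace::darken()
pal <- c(rbind(base_cols, dark_cols))

# Pair each level with its colour to inspect the mapping
print(setNames(pal, lv))
```

With this ordering, binary value 0 gets the base colour and 1 gets the darker shade. Swapping the two arguments of rbind() should reverse that.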

EDIT: To keep each interaction level tied to its colour even when a level is missing from the data, you can name the palette by the interaction levels, e.g.:

pal <- setNames(pal, levels(interaction(df$binary_dummy, df$color)))

# Remove one combination (binary_dummy 0, color "E") to test the mapping
df <- df[!(df$binary_dummy == 0 & df$color == "E"),]

ggplot(df, aes(x = cut, fill = interaction(binary_dummy, color))) +
  geom_bar(position = "fill") +
  scale_fill_manual(values = pal)
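
To see why the names matter: an unnamed palette is assigned to the levels present in the data in order, so a missing level shifts every later colour by one position. The sketch below imitates that in base R only. It is not ggplot2's internal code, and the level and colour names are hypothetical:

```r
# Hypothetical full set of interaction levels and their intended colours
lv  <- c("0.D", "1.D", "0.E", "1.E")
pal <- c("red", "darkred", "blue", "darkblue")

# Levels that still occur after the "0.E" rows have been removed
present <- setdiff(lv, "0.E")

# Positional matching: colours are handed out in order,
# so "1.E" slides into the slot that was meant for "0.E"
positional <- setNames(pal[seq_along(present)], present)

# Name matching: each level looks up its own colour
named <- setNames(pal, lv)[present]

print(positional)
print(named)
```

In the positional version "1.E" ends up with the light shade intended for "0.E", while the named lookup keeps the dark shade.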

Alternatively, you can also set the breaks of the scale:

ggplot(df, aes(x = cut, fill = interaction(binary_dummy, color))) +
  geom_bar(position = "fill") +
  scale_fill_manual(values = pal, breaks = levels(interaction(df$binary_dummy, df$color)))

Upvotes: 2
