Reputation: 150
I have a stacked bar chart that looks like this:
While the colors look nice, it is confusing to have so many similar colors representing different drugs. I would like to have a separate color palette for each bar in the graph. For example, class1 could use the palette "Blues" while class2 could use the palette "BuGn" (color palette names found here).
I have found some instances in which people manually coded colors for each bar (such as here), but I'm not sure if what I'm asking is possible - these bars would need to be based on palettes, since there are so many drugs in each drug class.
Code to create the above graph:
library(ggplot2)
library(plyr)
library(RColorBrewer)
drug_name <- c("a", "a", "b", "b", "b", "c", "d", "e", "e", "e", "e", "e", "e",
"f", "f", "g", "g", "g", "g", "h", "i", "j", "j", "j", "k", "k",
"k", "k", "k", "k", "l", "l", "m", "m", "m", "n", "o")
df <- data.frame(drug_name)
#get the frequency of each drug name
df_count <- count(df, 'drug_name')
#add a column that specifies the drug class
df_count$drug_class <- vector(mode='character', length=nrow(df_count))
df_count$drug_class[df_count$drug_name %in% c("a", "c", "e", "f")] <- 'class1'
df_count$drug_class[df_count$drug_name %in% c("b", "o")] <- 'class2'
df_count$drug_class[df_count$drug_name %in% c("d", "h", "i")] <- 'class3'
df_count$drug_class[df_count$drug_name %in% c("g", "j", "k", "l", "m", "n")] <- 'class4'
#expand color palette (from http://novyden.blogspot.com/2013/09/how-to-expand-color-palette-with-ggplot.html)
colorCount = length(unique(df_count$drug_name))
getPalette = colorRampPalette(brewer.pal(9, "Set1"))
test_plot <- ggplot(data = df_count, aes(x=drug_class, y=freq, fill=drug_name)) +
  geom_bar(stat="identity") +
  scale_fill_manual(values=getPalette(colorCount))
test_plot
Upvotes: 6
Views: 8457
Reputation: 228
The separate color palettes do not transfer consistently to the different classes. The colors are assigned in the order of the drug names (a, b, c, ...), so each palette ends up split across several classes. See ?scale_fill_manual for details.
In order to "match" them to each set of bars, we need to order the data.frame by class, and align the color palettes appropriately with the names.
Create repeating palettes to test correct (expected) ordering.
ncol = table(df_count$drug_class) # Number of colors needed for each class
repeating.pal = mapply(function(x,y) brewer.pal(x,y), ncol, c("Set2","Set2","Set2","Set2"))
repeating.pal[[2]] = repeating.pal[[2]][1:2] # We only need 2 colors but brewer.pal creates 3 minimum
repeating.pal = unname(unlist(repeating.pal)) # Combine into a single vector of colors
Sort the data according to class (the order we want the colors to remain in!)
df_count_sorted <- df_count[order(df_count$drug_class),]
Copy the original ordering of the drug names.
df_count_sorted$labOrder <- df_count$drug_name
Add in test color palette.
df_count_sorted$colours <- repeating.pal
Alter the plot routine, with fill = labOrder.
# cum.freq (the label position) must already exist in df_count before sorting
ggplot(data = df_count_sorted, aes(x=drug_class, y=freq, fill=labOrder) ) +
  geom_bar(stat="identity", colour="black", lwd=0.2) +
  geom_text(aes(label=paste0(drug_name,": ", freq), y=cum.freq), colour="grey20") +
  scale_fill_manual(values=df_count_sorted$colours) +
  guides(fill="none")
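An alternative to relying on row order is to give the color vector names, since scale_fill_manual matches a named vector of values to the fill levels by name. The sketch below is only an illustration: the toy data frame, the `ramps` list, and the color choices are assumptions, and it uses base R's colorRampPalette instead of brewer.pal so that it runs without extra packages.

```r
# Toy data mirroring the question: each drug belongs to exactly one class.
drugs <- data.frame(
  drug_name  = c("a", "b", "c", "d", "e", "f"),
  drug_class = c("class1", "class2", "class1", "class3", "class1", "class2"),
  stringsAsFactors = FALSE
)

# One colour ramp per class (hypothetical choices; any palette function works).
ramps <- list(
  class1 = colorRampPalette(c("lightblue", "darkblue")),
  class2 = colorRampPalette(c("lightgreen", "darkgreen")),
  class3 = colorRampPalette(c("mistyrose", "darkred"))
)

# Build a vector of colours named by drug, one class at a time.
named_pal <- unlist(lapply(names(ramps), function(cl) {
  members <- drugs$drug_name[drugs$drug_class == cl]
  setNames(ramps[[cl]](length(members)), members)
}))

# The three class1 drugs all draw from the blue ramp.
named_pal[c("a", "c", "e")]
```

Passing `named_pal` to scale_fill_manual(values = named_pal) should then keep each drug's color tied to its class regardless of how the rows are sorted.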
Upvotes: 4
Reputation: 93851
With so many colors, your plot is going to be confusing. It's probably better to just label each bar section with the drug name and the count. The code below shows one way to make separate palettes for each bar and also how to label the bars.
First, add a column that we'll use for positioning the bar labels:
library(dplyr) # for the chaining (%>%) operator
## Add a column for positioning drug labels on graph
df_count = df_count %>% group_by(drug_class) %>%
  mutate(cum.freq = cumsum(freq) - 0.5*freq)
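To see what that expression computes, here is a small worked example in base R using the four class1 frequencies from the question's data (a, c, e, f). The variable names `top` and `mid` are just for illustration.

```r
# Frequencies of the four drugs that make up the class1 bar.
freq <- c(a = 2, c = 1, e = 6, f = 2)

top <- cumsum(freq)      # upper edge of each segment: 2, 3, 9, 11
mid <- top - 0.5 * freq  # midpoint of each segment:   1, 2.5, 6, 10
mid
```

Each label therefore sits halfway up its own segment. In newer ggplot2 releases (2.2.0 and later, as far as I know), geom_text(position = position_stack(vjust = 0.5)) is another way to get the midpoints, and it follows whatever stacking order ggplot2 itself uses.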
Second, create the palettes. The code below uses four different Colorbrewer palettes, but you can use any combination of palette-creating functions or methods to control the colors as finely as you wish.
## Create separate palette for each drug class
# Count the number of colors we'll need for each bar
ncol = table(df_count$drug_class)
# Make the palettes
pal = mapply(function(x,y) brewer.pal(x,y), ncol, c("BrBG","OrRd","YlGn","Set2"))
pal[[2]] = pal[[2]][1:2] # We only need 2 colors but brewer.pal creates 3 minimum
pal = unname(unlist(pal)) # Combine palettes into single vector of colors
ggplot(data = df_count, aes(x=drug_class, y=freq, fill=drug_name) ) +
  geom_bar(stat="identity", colour="black", lwd=0.2) +
  geom_text(aes(label=paste0(drug_name,": ", freq), y=cum.freq), colour="grey20") +
  scale_fill_manual(values=pal) +
  guides(fill="none") # fill=FALSE is deprecated in recent ggplot2 versions
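The manual trimming to two colors works for this data, but a class could also need more colors than a ColorBrewer palette provides. One possible way to handle both cases is a small wrapper like the one below. `class_palette` is a hypothetical helper, not part of RColorBrewer; it only combines brewer.pal, brewer.pal.info, and colorRampPalette.

```r
library(RColorBrewer)

# Hypothetical helper: return exactly n colours from a ColorBrewer palette,
# trimming when n < 3 and interpolating when n exceeds the palette's maximum.
class_palette <- function(n, name) {
  max_n <- brewer.pal.info[name, "maxcolors"]
  base  <- brewer.pal(min(max(n, 3), max_n), name)
  if (n <= length(base)) base[seq_len(n)] else colorRampPalette(base)(n)
}

class_palette(2, "OrRd")    # 2 colours, without the brewer.pal minimum-3 warning
class_palette(12, "Blues")  # 12 colours interpolated from the 9 available
```

With such a helper, the mapply call could pass `class_palette` in place of `brewer.pal`, which would make the separate trimming step unnecessary.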
There are many strategies and functions for creating color palettes. Here's another method, using the hcl function:
lum = seq(100, 50, length.out=4) # Vary the luminance for each bar
shift = seq(20, 60, length.out=4) # Shift the hues for each bar
pal2 = mapply(function(n, l, s) hcl(seq(0 + s, 360 + s, length.out=n+1)[1:n], 100, l),
              ncol, lum, shift)
pal2 = unname(unlist(pal2))
Upvotes: 7