Reputation: 1363
I'm designing a Shiny app that creates a graph based on quartiles for various types of data. The app works well, but the colors of the rectangles I draw with geom_rect do not stay consistent from one plot to the next.
I've included code below that will generate two plots.
The first plot looks slightly "washed out", with more pastel-like colors than the second plot. Since I'm using the same process to create both, I'm not sure why the colors don't match. It's as if the alpha value, or the color values themselves, had changed.
This only seems to happen when I have two categories or fewer in my shiny app, but I'm just beating my head against a wall trying to figure out what I'm doing incorrectly here. Any ideas on why the colors on these two graphs are different?
library('tidyverse')
df <- structure(list(grade = c(1L, 1L, 2L, 2L, 2L, 2L),
benchmark = c("C","D", "B", "C", "D", "F"),
count = c(22L, 15L, 32L, 168L, 117L, 41L),
min = c(155, 169, 154, 160, 164, 178),
q05 = c(163.1, 170.4,161.6, 164.3, 169.8, 179),
q10 = c(165, 172.6, 165.2, 169, 172.6,180),
q15 = c(165, 175.1, 167, 171.1, 176, 181),
q20 = c(165, 175.8, 167.2, 173.4, 177.2, 182),
q25 = c(165.2, 176, 169.5, 174.8, 180, 184),
q30 = c(166, 176.4, 171, 176, 182, 184),
q35 = c(166, 177.8, 171.8, 177, 183, 185),
q40 = c(166.4, 178, 172, 179, 183, 186),
q45 = c(167.4, 178.3, 172.9, 180, 185, 187),
q50 = c(168, 179, 174.5, 181, 186, 188),
q55 = c(171.3, 182.5, 176.1, 181.8, 187, 189),
q60 = c(174.6, 184, 177, 183, 187, 190),
q65 = c(175, 184.2, 177, 183.6, 188, 192),
q70 = c(176.4, 185.6, 177.7, 185, 190, 192),
q75 = c(177, 187, 179, 185, 191, 194),
q80 = c(177.8, 188.4, 180.6, 187, 191, 194),
q85 = c(178.8, 189.8, 182.1, 188, 192.6, 195),
q90 = c(186.2, 193, 186.7, 190, 194.4, 199),
q95 = c(187, 196.8, 187.4, 192, 197, 201),
max = c(194, 201, 188, 203, 210, 206)),
.Names = c("grade", "benchmark", "count", "min", "q05",
"q10", "q15", "q20", "q25", "q30", "q35", "q40", "q45", "q50",
"q55", "q60", "q65", "q70", "q75", "q80", "q85", "q90", "q95",
"max"),
row.names = c(137L, 138L, 310L, 311L, 312L, 313L),
class = "data.frame")
#### Grade 1 Graph ########################################################################
# Sets up temporary data frame
temp <- df[df$grade == 1, ]
# Sets widths for geom_rect later.
for(i in seq(from = 1, to = nrow(temp), by = 1)){
temp$xmin[i] <- i - 1 + .55
temp$xmax[i] <- i + .45
}
ggplot(temp, aes(x = benchmark)) +
geom_boxplot(aes(lower = q20, middle = q50, upper = q80, ymax = max, ymin = min), stat = 'identity') +
scale_y_continuous(breaks = seq((min(temp$min)%/%10 * 10), (max(temp$max)%/%10 * 10 + 10), 10),
limits = c((min(temp$min)%/%10 * 10), (max(temp$max)%/%10 * 10 + 10))) +
labs(x = 'Category', y = 'Values', title = 'Percentile Boxplots') +
theme(axis.text = element_text(size = 12),
axis.title = element_text(size = 14),
title = element_text(size = 16)) +
## Geom_rect for Category C
geom_rect(aes(xmin = temp[temp$benchmark == 'C', ]$xmin, xmax = temp[temp$benchmark == 'C', ]$xmax,
ymin = temp[temp$benchmark == 'C', ]$q20, ymax = temp[temp$benchmark == 'C', ]$q40),
alpha = .15, fill = '#FFFF00') +
geom_rect(aes(xmin = temp[temp$benchmark == 'C', ]$xmin, xmax = temp[temp$benchmark == 'C', ]$xmax,
ymin = temp[temp$benchmark == 'C', ]$q40, ymax = temp[temp$benchmark == 'C', ]$q60),
alpha = .15, fill = '#92D050') +
geom_rect(aes(xmin = temp[temp$benchmark == 'C', ]$xmin, xmax = temp[temp$benchmark == 'C', ]$xmax,
ymin = temp[temp$benchmark == 'C', ]$q60, ymax = temp[temp$benchmark == 'C', ]$q80),
alpha = .15, fill = '#00B050') +
## Geom_rect for Category D
geom_rect(aes(xmin = temp[temp$benchmark == 'D', ]$xmin, xmax = temp[temp$benchmark == 'D', ]$xmax,
ymin = temp[temp$benchmark == 'D', ]$q20, ymax = temp[temp$benchmark == 'D', ]$q40),
alpha = .15, fill = '#FFFF00') +
geom_rect(aes(xmin = temp[temp$benchmark == 'D', ]$xmin, xmax = temp[temp$benchmark == 'D', ]$xmax,
ymin = temp[temp$benchmark == 'D', ]$q40, ymax = temp[temp$benchmark == 'D', ]$q60),
alpha = .15, fill = '#92D050') +
geom_rect(aes(xmin = temp[temp$benchmark == 'D', ]$xmin, xmax = temp[temp$benchmark == 'D', ]$xmax,
ymin = temp[temp$benchmark == 'D', ]$q60, ymax = temp[temp$benchmark == 'D', ]$q80),
alpha = .15, fill = '#00B050') +
## Geom_labels for quartiles.
geom_label(aes(x = benchmark, y = q20, label = round(q20, 1)), fill = '#fdae61', size = 4) +
geom_label(aes(x = benchmark, y = q80, label = round(q80, 1)), fill = '#a6d96a', size = 4) +
geom_label(aes(x = benchmark, y = q50, label = round(q50, 1), fontface = 'bold'), fill = '#ffffbf', size = 5) +
coord_flip()
#### Grade 2 Graph ####
temp <- df[df$grade == 2, ]
for(i in seq(from = 1, to = nrow(temp), by = 1)){
temp$xmin[i] <- i - 1 + .55
temp$xmax[i] <- i + .45
}
ggplot(temp, aes(x = benchmark)) +
geom_boxplot(aes(lower = q20, middle = q50, upper = q80, ymax = max, ymin = min), stat = 'identity') +
scale_y_continuous(breaks = seq((min(temp$min)%/%10 * 10), (max(temp$max)%/%10 * 10 + 10), 10),
limits = c((min(temp$min)%/%10 * 10), (max(temp$max)%/%10 * 10 + 10))) +
labs(x = 'Category', y = 'Values', title = 'Percentile Boxplots') +
theme(axis.text = element_text(size = 12),
axis.title = element_text(size = 14),
title = element_text(size = 16)) +
## Geom_rect for Category B
geom_rect(aes(xmin = temp[temp$benchmark == 'B', ]$xmin, xmax = temp[temp$benchmark == 'B', ]$xmax,
ymin = temp[temp$benchmark == 'B', ]$q20, ymax = temp[temp$benchmark == 'B', ]$q40),
alpha = .15, fill = '#FFFF00') +
geom_rect(aes(xmin = temp[temp$benchmark == 'B', ]$xmin, xmax = temp[temp$benchmark == 'B', ]$xmax,
ymin = temp[temp$benchmark == 'B', ]$q40, ymax = temp[temp$benchmark == 'B', ]$q60),
alpha = .15, fill = '#92D050') +
geom_rect(aes(xmin = temp[temp$benchmark == 'B', ]$xmin, xmax = temp[temp$benchmark == 'B', ]$xmax,
ymin = temp[temp$benchmark == 'B', ]$q60, ymax = temp[temp$benchmark == 'B', ]$q80),
alpha = .15, fill = '#00B050') +
## Geom_rect for Category C
geom_rect(aes(xmin = temp[temp$benchmark == 'C', ]$xmin, xmax = temp[temp$benchmark == 'C', ]$xmax,
ymin = temp[temp$benchmark == 'C', ]$q20, ymax = temp[temp$benchmark == 'C', ]$q40),
alpha = .15, fill = '#FFFF00') +
geom_rect(aes(xmin = temp[temp$benchmark == 'C', ]$xmin, xmax = temp[temp$benchmark == 'C', ]$xmax,
ymin = temp[temp$benchmark == 'C', ]$q40, ymax = temp[temp$benchmark == 'C', ]$q60),
alpha = .15, fill = '#92D050') +
geom_rect(aes(xmin = temp[temp$benchmark == 'C', ]$xmin, xmax = temp[temp$benchmark == 'C', ]$xmax,
ymin = temp[temp$benchmark == 'C', ]$q60, ymax = temp[temp$benchmark == 'C', ]$q80),
alpha = .15, fill = '#00B050') +
## Geom_rect for Category D
geom_rect(aes(xmin = temp[temp$benchmark == 'D', ]$xmin, xmax = temp[temp$benchmark == 'D', ]$xmax,
ymin = temp[temp$benchmark == 'D', ]$q20, ymax = temp[temp$benchmark == 'D', ]$q40),
alpha = .15, fill = '#FFFF00') +
geom_rect(aes(xmin = temp[temp$benchmark == 'D', ]$xmin, xmax = temp[temp$benchmark == 'D', ]$xmax,
ymin = temp[temp$benchmark == 'D', ]$q40, ymax = temp[temp$benchmark == 'D', ]$q60),
alpha = .15, fill = '#92D050') +
geom_rect(aes(xmin = temp[temp$benchmark == 'D', ]$xmin, xmax = temp[temp$benchmark == 'D', ]$xmax,
ymin = temp[temp$benchmark == 'D', ]$q60, ymax = temp[temp$benchmark == 'D', ]$q80),
alpha = .15, fill = '#00B050') +
## Geom_rect for Category F
geom_rect(aes(xmin = temp[temp$benchmark == 'F', ]$xmin, xmax = temp[temp$benchmark == 'F', ]$xmax,
ymin = temp[temp$benchmark == 'F', ]$q20, ymax = temp[temp$benchmark == 'F', ]$q40),
alpha = .15, fill = '#FFFF00') +
geom_rect(aes(xmin = temp[temp$benchmark == 'F', ]$xmin, xmax = temp[temp$benchmark == 'F', ]$xmax,
ymin = temp[temp$benchmark == 'F', ]$q40, ymax = temp[temp$benchmark == 'F', ]$q60),
alpha = .15, fill = '#92D050') +
geom_rect(aes(xmin = temp[temp$benchmark == 'F', ]$xmin, xmax = temp[temp$benchmark == 'F', ]$xmax,
ymin = temp[temp$benchmark == 'F', ]$q60, ymax = temp[temp$benchmark == 'F', ]$q80),
alpha = .15, fill = '#00B050') +
## Geom_labels for quartiles.
geom_label(aes(x = benchmark, y = q20, label = round(q20, 1)), fill = '#fdae61', size = 4) +
geom_label(aes(x = benchmark, y = q80, label = round(q80, 1)), fill = '#a6d96a', size = 4) +
geom_label(aes(x = benchmark, y = q50, label = round(q50, 1), fontface = 'bold'), fill = '#ffffbf', size = 5) +
coord_flip()
Upvotes: 1
Views: 818
Reputation: 29095
Each of your geom_rect() calls for a specific category actually created multiple rectangles overlapping one another, each with alpha = .15, so the overall colour became more intense.
Instead of:
# temp here is based on the second plot, with 4 categories
p.overlay <- ggplot(temp) +
## Geom_rect for Category B
geom_rect(aes(xmin = temp[temp$benchmark == 'B', ]$xmin,
xmax = temp[temp$benchmark == 'B', ]$xmax,
ymin = temp[temp$benchmark == 'B', ]$q20,
ymax = temp[temp$benchmark == 'B', ]$q40),
alpha = .15, fill = '#FFFF00') +
## Geom_rect for Category C
geom_rect(aes(xmin = temp[temp$benchmark == 'C', ]$xmin,
xmax = temp[temp$benchmark == 'C', ]$xmax,
ymin = temp[temp$benchmark == 'C', ]$q20,
ymax = temp[temp$benchmark == 'C', ]$q40),
alpha = .15, fill = '#FFFF00') +
## Geom_rect for Category D
geom_rect(aes(xmin = temp[temp$benchmark == 'D', ]$xmin,
xmax = temp[temp$benchmark == 'D', ]$xmax,
ymin = temp[temp$benchmark == 'D', ]$q20,
ymax = temp[temp$benchmark == 'D', ]$q40),
alpha = .15, fill = '#FFFF00') +
## Geom_rect for Category F
geom_rect(aes(xmin = temp[temp$benchmark == 'F', ]$xmin,
xmax = temp[temp$benchmark == 'F', ]$xmax,
ymin = temp[temp$benchmark == 'F', ]$q20,
ymax = temp[temp$benchmark == 'F', ]$q40),
alpha = .15, fill = '#FFFF00')
Try:
p.single <- ggplot(temp) +
geom_rect(aes(xmin = xmin, xmax = xmax, ymin = q20, ymax = q40),
alpha = .15, fill = "#FFFF00")
Compare results:
cowplot::plot_grid(p.overlay, p.single, labels = c("Overlay", "Single"))
Explanation
If we look at the structures of p.overlay vs. p.single, we can see that each geom_rect() created a separate layer:
> length(p.overlay$layers)
[1] 4
> length(p.single$layers)
[1] 1
layer_data() returns the data associated with each specific layer, and we can see that each rectangle layer in p.overlay is actually associated with four identical overlapping rectangles (for the same category), while the rectangle layer in p.single is associated with four different rectangles, each for a different category:
> lapply(1:4, function(i) layer_data(p.overlay, i))
[[1]]
xmin xmax ymin ymax PANEL group colour fill size linetype alpha
1 0.55 1.45 167.2 172 1 -1 NA #FFFF00 0.5 1 0.15
2 0.55 1.45 167.2 172 1 -1 NA #FFFF00 0.5 1 0.15
3 0.55 1.45 167.2 172 1 -1 NA #FFFF00 0.5 1 0.15
4 0.55 1.45 167.2 172 1 -1 NA #FFFF00 0.5 1 0.15
[[2]]
xmin xmax ymin ymax PANEL group colour fill size linetype alpha
1 1.55 2.45 173.4 179 1 -1 NA #FFFF00 0.5 1 0.15
2 1.55 2.45 173.4 179 1 -1 NA #FFFF00 0.5 1 0.15
3 1.55 2.45 173.4 179 1 -1 NA #FFFF00 0.5 1 0.15
4 1.55 2.45 173.4 179 1 -1 NA #FFFF00 0.5 1 0.15
[[3]]
xmin xmax ymin ymax PANEL group colour fill size linetype alpha
1 2.55 3.45 177.2 183 1 -1 NA #FFFF00 0.5 1 0.15
2 2.55 3.45 177.2 183 1 -1 NA #FFFF00 0.5 1 0.15
3 2.55 3.45 177.2 183 1 -1 NA #FFFF00 0.5 1 0.15
4 2.55 3.45 177.2 183 1 -1 NA #FFFF00 0.5 1 0.15
[[4]]
xmin xmax ymin ymax PANEL group colour fill size linetype alpha
1 3.55 4.45 182 186 1 -1 NA #FFFF00 0.5 1 0.15
2 3.55 4.45 182 186 1 -1 NA #FFFF00 0.5 1 0.15
3 3.55 4.45 182 186 1 -1 NA #FFFF00 0.5 1 0.15
4 3.55 4.45 182 186 1 -1 NA #FFFF00 0.5 1 0.15
> layer_data(p.single, i = 1)
xmin xmax ymin ymax PANEL group colour fill size linetype alpha
1 0.55 1.45 167.2 172 1 -1 NA #FFFF00 0.5 1 0.15
2 1.55 2.45 173.4 179 1 -1 NA #FFFF00 0.5 1 0.15
3 2.55 3.45 177.2 183 1 -1 NA #FFFF00 0.5 1 0.15
4 3.55 4.45 182.0 186 1 -1 NA #FFFF00 0.5 1 0.15
Why was everything repeated four times? Because the top-level ggplot() call specified temp as the data source for all subsequent geoms to inherit by default, and it has four rows. Each aesthetic inside aes() was a single value, so it was recycled once per row. If we had used the temp data frame generated for the first plot instead, everything would have been repeated only two times, which is why the rectangles in that plot look paler.
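To put rough numbers on the difference, here is a small sketch of how the opacity builds up when identical rectangles are stacked. It assumes the graphics device uses ordinary "source-over" alpha blending, and effective_alpha is a hypothetical helper name, not part of ggplot2:

```r
# Combined opacity of n identical semi-transparent rectangles drawn on top of
# one another, assuming standard "source-over" alpha blending on the device.
effective_alpha <- function(alpha, n) 1 - (1 - alpha)^n

effective_alpha(0.15, 1)  # a single rectangle, as in p.single
effective_alpha(0.15, 2)  # two copies: temp with two rows (first plot)
effective_alpha(0.15, 4)  # four copies: temp with four rows (second plot)
```

Under that assumption, two copies give roughly 0.28 and four copies roughly 0.48, compared with the intended 0.15.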
To avoid this phenomenon, I recommend adopting the approach demonstrated in p.single above, and using one geom_rect() to specify fill colours for all categories. The code is shorter, cleaner, and more flexible to changes in the number of categories or their labels.
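One possible way to extend the single-layer idea to all three bands is sketched below. It is not the only option: it assumes the bands are stacked into a long data frame and coloured with scale_fill_identity(). The names band, bands and p.bands are made up for this illustration, and the values are the grade 1 rows from the question:

```r
library(ggplot2)

# Grade 1 values copied from the question's data (two categories).
temp <- data.frame(
  benchmark = c("C", "D"),
  q20 = c(165, 175.8),
  q40 = c(166.4, 178),
  q60 = c(174.6, 184),
  q80 = c(177.8, 188.4),
  stringsAsFactors = FALSE
)
# Vectorised equivalent of the question's loop (i - 1 + .55 and i + .45).
temp$xmin <- seq_len(nrow(temp)) - 0.45
temp$xmax <- seq_len(nrow(temp)) + 0.45

# Hypothetical helper: one row per category for the band between two columns.
band <- function(lo, hi, fill) {
  data.frame(xmin = temp$xmin, xmax = temp$xmax,
             ymin = temp[[lo]], ymax = temp[[hi]],
             fill = fill, stringsAsFactors = FALSE)
}

bands <- rbind(
  band("q20", "q40", "#FFFF00"),
  band("q40", "q60", "#92D050"),
  band("q60", "q80", "#00B050")
)

# A single layer; the fill column already holds literal colours.
p.bands <- ggplot(bands) +
  geom_rect(aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax, fill = fill),
            alpha = .15) +
  scale_fill_identity()

# Each rectangle should appear exactly once in the layer data.
nrow(layer_data(p.bands, 1))
```

With this layout, adding or removing a category only changes the rows of temp, not the plotting code.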
If you have strong reasons to define a separate geom_rect() for each category, don't specify any data frame in the top-level ggplot() call. Based on your original code, only the geom_boxplot() and geom_label() layers rely on it, so you can specify data = temp in those layers instead.
(The above demonstration is for the q20-q40 rectangles, but the same principle applies for the rest.)
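For the per-category alternative, a minimal sketch might look like the following. It is a variation on the advice above rather than the original code: it assumes each geom_rect() is given its own one-row subset through data =, and p.separate is a made-up name. Only the q20-q40 band for the grade 1 categories is shown:

```r
library(ggplot2)

# Grade 1 values copied from the question's data (q20-q40 band only).
temp <- data.frame(
  benchmark = c("C", "D"),
  q20 = c(165, 175.8),
  q40 = c(166.4, 178),
  stringsAsFactors = FALSE
)
temp$xmin <- seq_len(nrow(temp)) - 0.45
temp$xmax <- seq_len(nrow(temp)) + 0.45

# No data frame in ggplot(): each layer only sees the rows passed to it.
p.separate <- ggplot() +
  geom_rect(data = temp[temp$benchmark == "C", ],
            aes(xmin = xmin, xmax = xmax, ymin = q20, ymax = q40),
            alpha = .15, fill = "#FFFF00") +
  geom_rect(data = temp[temp$benchmark == "D", ],
            aes(xmin = xmin, xmax = xmax, ymin = q20, ymax = q40),
            alpha = .15, fill = "#FFFF00")

# Number of rectangles drawn by each layer.
sapply(1:2, function(i) nrow(layer_data(p.separate, i)))
```

Each layer should now report a single rectangle, so the colours no longer depend on how many categories the plot has.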
Upvotes: 1