Sam

Reputation: 1363

Geom_rect colors not remaining constant in ggplot

I'm designing a shiny app that creates a graph based on quartiles for various types of data. The app works well; however, I'm noticing that the colors of the geom_rect layers in my plots do not stay constant from one plot to the next.

I've included code below that will generate two plots.

The first plot looks slightly "washed out", with more pastel-like colors than the second plot. Since I'm using the same process to create both, I'm not sure why the colors don't match. It's as if the alpha value, or the color values themselves, had changed.

This only seems to happen when I have two categories or fewer in my shiny app, but I'm just beating my head against a wall trying to figure out what I'm doing incorrectly here. Any ideas on why the colors on these two graphs are different?

library('tidyverse')

df <- structure(list(grade = c(1L, 1L, 2L, 2L, 2L, 2L), 
                     benchmark = c("C","D", "B", "C", "D", "F"), 
                     count = c(22L, 15L, 32L, 168L, 117L, 41L), 
                     min = c(155, 169, 154, 160, 164, 178), 
                     q05 = c(163.1, 170.4,161.6, 164.3, 169.8, 179), 
                     q10 = c(165, 172.6, 165.2, 169, 172.6,180), 
                     q15 = c(165, 175.1, 167, 171.1, 176, 181), 
                     q20 = c(165, 175.8, 167.2, 173.4, 177.2, 182), 
                     q25 = c(165.2, 176, 169.5, 174.8, 180, 184), 
                     q30 = c(166, 176.4, 171, 176, 182, 184), 
                     q35 = c(166, 177.8, 171.8, 177, 183, 185), 
                     q40 = c(166.4, 178, 172, 179, 183, 186), 
                     q45 = c(167.4, 178.3, 172.9, 180, 185, 187), 
                     q50 = c(168, 179, 174.5, 181, 186, 188), 
                     q55 = c(171.3, 182.5, 176.1, 181.8, 187, 189), 
                     q60 = c(174.6, 184, 177, 183, 187, 190), 
                     q65 = c(175, 184.2, 177, 183.6, 188, 192), 
                     q70 = c(176.4, 185.6, 177.7, 185, 190, 192),
                     q75 = c(177, 187, 179, 185, 191, 194), 
                     q80 = c(177.8, 188.4, 180.6, 187, 191, 194), 
                     q85 = c(178.8, 189.8, 182.1, 188, 192.6, 195), 
                     q90 = c(186.2, 193, 186.7, 190, 194.4, 199), 
                     q95 = c(187, 196.8, 187.4, 192, 197, 201), 
                     max = c(194, 201, 188, 203, 210, 206)), 
                .Names = c("grade", "benchmark", "count", "min", "q05", 
                           "q10", "q15", "q20", "q25", "q30", "q35", "q40", "q45", "q50", 
                           "q55", "q60", "q65", "q70", "q75", "q80", "q85", "q90", "q95", 
                           "max"), 
                row.names = c(137L, 138L, 310L, 311L, 312L, 313L), 
                class = "data.frame")

#### Grade 1 Graph ########################################################################

# Sets up temporary data frame
temp <- df[df$grade == 1, ]

# Sets widths for geom_rect later.
for(i in seq(from = 1, to = nrow(temp), by = 1)){
  temp$xmin[i] <- i - 1 + .55
  temp$xmax[i] <- i + .45
}

ggplot(temp, aes(x = benchmark)) +
  geom_boxplot(aes(lower = q20, middle = q50, upper = q80, ymax = max, ymin = min), stat = 'identity') +
  scale_y_continuous(breaks = seq((min(temp$min)%/%10 * 10), (max(temp$max)%/%10 * 10 + 10), 10),
                     limits = c((min(temp$min)%/%10 * 10), (max(temp$max)%/%10 * 10 + 10))) +
  labs(x = 'Category', y = 'Values', title = 'Percentile Boxplots') +
  theme(axis.text = element_text(size = 12),
        axis.title = element_text(size = 14),
        title = element_text(size = 16)) +

  ## Geom_rect for Category C
  geom_rect(aes(xmin = temp[temp$benchmark == 'C', ]$xmin, xmax = temp[temp$benchmark == 'C', ]$xmax,
                ymin = temp[temp$benchmark == 'C', ]$q20, ymax = temp[temp$benchmark == 'C', ]$q40),
            alpha = .15, fill = '#FFFF00') +
  geom_rect(aes(xmin = temp[temp$benchmark == 'C', ]$xmin, xmax = temp[temp$benchmark == 'C', ]$xmax,
                ymin = temp[temp$benchmark == 'C', ]$q40, ymax = temp[temp$benchmark == 'C', ]$q60),
            alpha = .15, fill = '#92D050') +
  geom_rect(aes(xmin = temp[temp$benchmark == 'C', ]$xmin, xmax = temp[temp$benchmark == 'C', ]$xmax,
                ymin = temp[temp$benchmark == 'C', ]$q60, ymax = temp[temp$benchmark == 'C', ]$q80),
            alpha = .15, fill = '#00B050') +

  ## Geom_rect for Category D
  geom_rect(aes(xmin = temp[temp$benchmark == 'D', ]$xmin, xmax = temp[temp$benchmark == 'D', ]$xmax,
                ymin = temp[temp$benchmark == 'D', ]$q20, ymax = temp[temp$benchmark == 'D', ]$q40),
            alpha = .15, fill = '#FFFF00') +
  geom_rect(aes(xmin = temp[temp$benchmark == 'D', ]$xmin, xmax = temp[temp$benchmark == 'D', ]$xmax,
                ymin = temp[temp$benchmark == 'D', ]$q40, ymax = temp[temp$benchmark == 'D', ]$q60),
            alpha = .15, fill = '#92D050') +
  geom_rect(aes(xmin = temp[temp$benchmark == 'D', ]$xmin, xmax = temp[temp$benchmark == 'D', ]$xmax,
                ymin = temp[temp$benchmark == 'D', ]$q60, ymax = temp[temp$benchmark == 'D', ]$q80),
            alpha = .15, fill = '#00B050') +

  ## Geom_labels for quartiles.
  geom_label(aes(x = benchmark, y = q20, label = round(q20, 1)), fill = '#fdae61', size = 4) +
  geom_label(aes(x = benchmark, y = q80, label = round(q80, 1)), fill = '#a6d96a', size = 4) +
  geom_label(aes(x = benchmark, y = q50, label = round(q50, 1), fontface = 'bold'), fill = '#ffffbf', size = 5) +
  coord_flip() 

#### Grade 2 Graph ####

temp <- df[df$grade == 2, ]

for(i in seq(from = 1, to = nrow(temp), by = 1)){
  temp$xmin[i] <- i - 1 + .55
  temp$xmax[i] <- i + .45
}

ggplot(temp, aes(x = benchmark)) +
  geom_boxplot(aes(lower = q20, middle = q50, upper = q80, ymax = max, ymin = min), stat = 'identity') +
  scale_y_continuous(breaks = seq((min(temp$min)%/%10 * 10), (max(temp$max)%/%10 * 10 + 10), 10),
                     limits = c((min(temp$min)%/%10 * 10), (max(temp$max)%/%10 * 10 + 10))) +
  labs(x = 'Category', y = 'Values', title = 'Percentile Boxplots') +
  theme(axis.text = element_text(size = 12),
        axis.title = element_text(size = 14),
        title = element_text(size = 16)) +

  ## Geom_rect for Category B
  geom_rect(aes(xmin = temp[temp$benchmark == 'B', ]$xmin, xmax = temp[temp$benchmark == 'B', ]$xmax,
                ymin = temp[temp$benchmark == 'B', ]$q20, ymax = temp[temp$benchmark == 'B', ]$q40),
            alpha = .15, fill = '#FFFF00') +
  geom_rect(aes(xmin = temp[temp$benchmark == 'B', ]$xmin, xmax = temp[temp$benchmark == 'B', ]$xmax,
                ymin = temp[temp$benchmark == 'B', ]$q40, ymax = temp[temp$benchmark == 'B', ]$q60),
            alpha = .15, fill = '#92D050') +
  geom_rect(aes(xmin = temp[temp$benchmark == 'B', ]$xmin, xmax = temp[temp$benchmark == 'B', ]$xmax,
                ymin = temp[temp$benchmark == 'B', ]$q60, ymax = temp[temp$benchmark == 'B', ]$q80),
            alpha = .15, fill = '#00B050') +

  ## Geom_rect for Category C
  geom_rect(aes(xmin = temp[temp$benchmark == 'C', ]$xmin, xmax = temp[temp$benchmark == 'C', ]$xmax,
                ymin = temp[temp$benchmark == 'C', ]$q20, ymax = temp[temp$benchmark == 'C', ]$q40),
            alpha = .15, fill = '#FFFF00') +
  geom_rect(aes(xmin = temp[temp$benchmark == 'C', ]$xmin, xmax = temp[temp$benchmark == 'C', ]$xmax,
                ymin = temp[temp$benchmark == 'C', ]$q40, ymax = temp[temp$benchmark == 'C', ]$q60),
            alpha = .15, fill = '#92D050') +
  geom_rect(aes(xmin = temp[temp$benchmark == 'C', ]$xmin, xmax = temp[temp$benchmark == 'C', ]$xmax,
                ymin = temp[temp$benchmark == 'C', ]$q60, ymax = temp[temp$benchmark == 'C', ]$q80),
            alpha = .15, fill = '#00B050') +

  ## Geom_rect for Category D
  geom_rect(aes(xmin = temp[temp$benchmark == 'D', ]$xmin, xmax = temp[temp$benchmark == 'D', ]$xmax,
                ymin = temp[temp$benchmark == 'D', ]$q20, ymax = temp[temp$benchmark == 'D', ]$q40),
            alpha = .15, fill = '#FFFF00') +
  geom_rect(aes(xmin = temp[temp$benchmark == 'D', ]$xmin, xmax = temp[temp$benchmark == 'D', ]$xmax,
                ymin = temp[temp$benchmark == 'D', ]$q40, ymax = temp[temp$benchmark == 'D', ]$q60),
            alpha = .15, fill = '#92D050') +
  geom_rect(aes(xmin = temp[temp$benchmark == 'D', ]$xmin, xmax = temp[temp$benchmark == 'D', ]$xmax,
                ymin = temp[temp$benchmark == 'D', ]$q60, ymax = temp[temp$benchmark == 'D', ]$q80),
            alpha = .15, fill = '#00B050') +

  ## Geom_rect for Category F
  geom_rect(aes(xmin = temp[temp$benchmark == 'F', ]$xmin, xmax = temp[temp$benchmark == 'F', ]$xmax,
                ymin = temp[temp$benchmark == 'F', ]$q20, ymax = temp[temp$benchmark == 'F', ]$q40),
            alpha = .15, fill = '#FFFF00') +
  geom_rect(aes(xmin = temp[temp$benchmark == 'F', ]$xmin, xmax = temp[temp$benchmark == 'F', ]$xmax,
                ymin = temp[temp$benchmark == 'F', ]$q40, ymax = temp[temp$benchmark == 'F', ]$q60),
            alpha = .15, fill = '#92D050') +
  geom_rect(aes(xmin = temp[temp$benchmark == 'F', ]$xmin, xmax = temp[temp$benchmark == 'F', ]$xmax,
                ymin = temp[temp$benchmark == 'F', ]$q60, ymax = temp[temp$benchmark == 'F', ]$q80),
            alpha = .15, fill = '#00B050') +

  ## Geom_labels for quartiles.
  geom_label(aes(x = benchmark, y = q20, label = round(q20, 1)), fill = '#fdae61', size = 4) +
  geom_label(aes(x = benchmark, y = q80, label = round(q80, 1)), fill = '#a6d96a', size = 4) +
  geom_label(aes(x = benchmark, y = q50, label = round(q50, 1), fontface = 'bold'), fill = '#ffffbf', size = 5) +
  coord_flip() 

Upvotes: 1

Views: 818

Answers (1)

Z.Lin

Reputation: 29095

Each of your geom_rect() calls for a specific category actually drew several identical rectangles on top of one another (one per row of temp), each with alpha = .15, so the overall colour became more intense as the number of rows grew.

Instead of:

# temp here is based on the second plot, with 4 categories
p.overlay <- ggplot(temp) +

  ## Geom_rect for Category B
  geom_rect(aes(xmin = temp[temp$benchmark == 'B', ]$xmin, 
                xmax = temp[temp$benchmark == 'B', ]$xmax,
                ymin = temp[temp$benchmark == 'B', ]$q20, 
                ymax = temp[temp$benchmark == 'B', ]$q40),
            alpha = .15, fill = '#FFFF00') +

  ## Geom_rect for Category C
  geom_rect(aes(xmin = temp[temp$benchmark == 'C', ]$xmin, 
                xmax = temp[temp$benchmark == 'C', ]$xmax,
                ymin = temp[temp$benchmark == 'C', ]$q20, 
                ymax = temp[temp$benchmark == 'C', ]$q40),
            alpha = .15, fill = '#FFFF00') +

  ## Geom_rect for Category D
  geom_rect(aes(xmin = temp[temp$benchmark == 'D', ]$xmin, 
                xmax = temp[temp$benchmark == 'D', ]$xmax,
                ymin = temp[temp$benchmark == 'D', ]$q20, 
                ymax = temp[temp$benchmark == 'D', ]$q40),
            alpha = .15, fill = '#FFFF00') +

  ## Geom_rect for Category F
  geom_rect(aes(xmin = temp[temp$benchmark == 'F', ]$xmin, 
                xmax = temp[temp$benchmark == 'F', ]$xmax,
                ymin = temp[temp$benchmark == 'F', ]$q20, 
                ymax = temp[temp$benchmark == 'F', ]$q40),
            alpha = .15, fill = '#FFFF00')

Try:

p.single <- ggplot(temp) +

  geom_rect(aes(xmin = xmin, xmax = xmax, ymin = q20, ymax = q40),
            alpha = .15, fill = "#FFFF00")

Compare results:

cowplot::plot_grid(p.overlay, p.single, labels = c("Overlay", "Single"))

[Image: comparison plot, p.overlay ("Overlay") next to p.single ("Single")]

Explanation

If we look at the structures of p.overlay vs. p.single, we can see that each geom_rect() created a separate layer:

> length(p.overlay$layers)
[1] 4
> length(p.single$layers)
[1] 1

layer_data() returns the data associated with each specific layer, and we can see that each rectangle layer in p.overlay is actually associated with four identical overlapping rectangles (for the same category), while the rectangle layer in p.single is associated with four different rectangles, each for a different category:

> lapply(1:4, function(i) layer_data(p.overlay, i))
[[1]]
  xmin xmax  ymin ymax PANEL group colour    fill size linetype alpha
1 0.55 1.45 167.2  172     1    -1     NA #FFFF00  0.5        1  0.15
2 0.55 1.45 167.2  172     1    -1     NA #FFFF00  0.5        1  0.15
3 0.55 1.45 167.2  172     1    -1     NA #FFFF00  0.5        1  0.15
4 0.55 1.45 167.2  172     1    -1     NA #FFFF00  0.5        1  0.15

[[2]]
  xmin xmax  ymin ymax PANEL group colour    fill size linetype alpha
1 1.55 2.45 173.4  179     1    -1     NA #FFFF00  0.5        1  0.15
2 1.55 2.45 173.4  179     1    -1     NA #FFFF00  0.5        1  0.15
3 1.55 2.45 173.4  179     1    -1     NA #FFFF00  0.5        1  0.15
4 1.55 2.45 173.4  179     1    -1     NA #FFFF00  0.5        1  0.15

[[3]]
  xmin xmax  ymin ymax PANEL group colour    fill size linetype alpha
1 2.55 3.45 177.2  183     1    -1     NA #FFFF00  0.5        1  0.15
2 2.55 3.45 177.2  183     1    -1     NA #FFFF00  0.5        1  0.15
3 2.55 3.45 177.2  183     1    -1     NA #FFFF00  0.5        1  0.15
4 2.55 3.45 177.2  183     1    -1     NA #FFFF00  0.5        1  0.15

[[4]]
  xmin xmax ymin ymax PANEL group colour    fill size linetype alpha
1 3.55 4.45  182  186     1    -1     NA #FFFF00  0.5        1  0.15
2 3.55 4.45  182  186     1    -1     NA #FFFF00  0.5        1  0.15
3 3.55 4.45  182  186     1    -1     NA #FFFF00  0.5        1  0.15
4 3.55 4.45  182  186     1    -1     NA #FFFF00  0.5        1  0.15

> layer_data(p.single, i = 1)
  xmin xmax  ymin ymax PANEL group colour    fill size linetype alpha
1 0.55 1.45 167.2  172     1    -1     NA #FFFF00  0.5        1  0.15
2 1.55 2.45 173.4  179     1    -1     NA #FFFF00  0.5        1  0.15
3 2.55 3.45 177.2  183     1    -1     NA #FFFF00  0.5        1  0.15
4 3.55 4.45 182.0  186     1    -1     NA #FFFF00  0.5        1  0.15

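Why was everything repeated four times? Because the top-level ggplot() call specified temp as the default data source for all subsequent geoms to inherit, and temp has four rows. Each aes() expression such as temp[temp$benchmark == 'B', ]$xmin evaluates to a single value, which is then recycled across all four rows. If we had used the temp data frame generated for the first plot instead, everything would have been repeated only twice, which is why the first plot looks more washed out than the second.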
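The size of the effect can be estimated with a little arithmetic. As a rough sketch, assuming standard "over" alpha blending (an assumption about how the graphics device composites layers, not something taken from the plots above), n identical rectangles at opacity a combine to an effective opacity of 1 - (1 - a)^n. The helper name effective_alpha is made up for this illustration:

```r
# Effective opacity of n identical rectangles stacked on top of each other,
# assuming standard "over" alpha blending (an assumption, see lead-in).
effective_alpha <- function(alpha, n) 1 - (1 - alpha)^n

effective_alpha(0.15, 1)  # a single rectangle, as in p.single
effective_alpha(0.15, 2)  # temp with two rows (first plot)
effective_alpha(0.15, 4)  # temp with four rows (second plot)
```

Under that assumption, two stacked rectangles come out at roughly 0.28 opacity and four at roughly 0.48, which would be consistent with the second plot looking noticeably more saturated than the first.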

To avoid this phenomenon, I recommend adopting the approach demonstrated in p.single above, and using one geom_rect() to specify fill colours for all categories. The code is shorter, cleaner, and more flexible to changes in category number / labels.
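The same idea can be pushed one step further so that all three coloured bands come from a single layer. The following is only a sketch: the small temp data frame is a stand-in built from the grade 1 values in the question, and bands and p.bands are hypothetical names introduced here. It stacks one row per rectangle and lets scale_fill_identity() use the hex codes directly:

```r
library(ggplot2)

# Stand-in for `temp` (grade 1 values copied from the question's data).
temp <- data.frame(
  benchmark = c("C", "D"),
  q20 = c(165, 175.8), q40 = c(166.4, 178),
  q60 = c(174.6, 184), q80 = c(177.8, 188.4)
)
temp$xmin <- seq_len(nrow(temp)) - 0.45  # same as i - 1 + .55
temp$xmax <- seq_len(nrow(temp)) + 0.45

# One row per rectangle: three bands per category, each with its own fill.
# `bands` is a hypothetical helper, not part of the original code.
bands <- data.frame(
  xmin = rep(temp$xmin, 3),
  xmax = rep(temp$xmax, 3),
  ymin = c(temp$q20, temp$q40, temp$q60),
  ymax = c(temp$q40, temp$q60, temp$q80),
  fill = rep(c("#FFFF00", "#92D050", "#00B050"), each = nrow(temp)),
  stringsAsFactors = FALSE
)

p.bands <- ggplot(bands) +
  geom_rect(aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax, fill = fill),
            alpha = .15) +
  scale_fill_identity()  # treat the values in `fill` as literal colours

# Every rectangle appears exactly once in the layer data.
nrow(layer_data(p.bands, 1))
```

Because each rectangle is its own row, nothing is recycled and the number of categories no longer affects the colour intensity.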

If you have strong reasons to define separate geom_rect() for each category, don't specify any data frame in the top level ggplot() call. Based on your original code, only geom_boxplot() uses it anyway, so you can specify data = temp there instead.
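One way this could look is sketched below. It is a variant of the suggestion rather than the exact code described above: instead of indexing temp inside aes(), each geom_rect() receives only its own row through the data argument. The temp values are a stand-in taken from the grade 1 rows of the question, and p.separate is a name made up for this example:

```r
library(ggplot2)

# Stand-in for `temp` (grade 1, q20-q40 band only).
temp <- data.frame(
  benchmark = c("C", "D"),
  xmin = c(0.55, 1.55), xmax = c(1.45, 2.45),
  q20 = c(165, 175.8), q40 = c(166.4, 178)
)

# No data in ggplot(): each layer is given only the row it should draw.
p.separate <- ggplot() +
  geom_rect(data = temp[temp$benchmark == "C", ],
            aes(xmin = xmin, xmax = xmax, ymin = q20, ymax = q40),
            alpha = .15, fill = "#FFFF00") +
  geom_rect(data = temp[temp$benchmark == "D", ],
            aes(xmin = xmin, xmax = xmax, ymin = q20, ymax = q40),
            alpha = .15, fill = "#FFFF00")

# Number of rectangles held by each of the two layers.
sapply(1:2, function(i) nrow(layer_data(p.separate, i)))
```

Each layer should then contain a single rectangle, so the alpha = .15 fill is applied once per category regardless of how many rows temp has.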

(The above demonstration is for the q20-q40 rectangles, but the same principle applies for the rest.)

Upvotes: 1
