Jake Thompson

Reputation: 2843

Remove baseline color for geom_histogram

I'm adding a color aesthetic to a faceted histogram. In the reprex below, with no color aesthetic, each histogram only shows data within that facet level. However, with color defined, a baseline is added that stretches to include the range of data across all facets. Is there a way to make this not happen?

I'm looking for something similar to geom_density with trim = TRUE, but there doesn't appear to be a trim option for geom_histogram.

library(tidyverse)

data <- tibble(a = rchisq(1000, df = 3),
               b = rchisq(1000, df = 1),
               c = rchisq(1000, df = 10)) %>%
  gather()

ggplot(data, aes(x = value)) +
  geom_histogram() +
  facet_wrap(~ key, ncol = 1)
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

ggplot(data, aes(x = value)) +
  geom_histogram(color = "red") +
  facet_wrap(~ key, ncol = 1)
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

ggplot(data, aes(x = value)) +
  geom_density(color = "red", trim = TRUE) +
  facet_wrap(~ key, ncol = 1)

Created on 2019-07-20 by the reprex package (v0.3.0)

Upvotes: 5

Views: 1196

Answers (2)

Quinten

Reputation: 41437

Another option is to use after_stat in the y aesthetic, with an ifelse that checks whether the count is greater than 0 and otherwise replaces it with NA. This removes the baseline color:

library(tidyverse)
ggplot(data, aes(x = value, y = ifelse(after_stat(count) > 0, after_stat(count), NA))) +
  geom_histogram(color = "red") +
  facet_wrap(~ key, ncol = 1)

Created on 2023-02-15 with reprex v2.0.2

Upvotes: 2

Z.Lin

Reputation: 29095

geom_histogram draws its bars using rectGrob from the grid package, and a zero-width / zero-height rectGrob is depicted as a vertical / horizontal line in the outline colour, at least in my set-up for RStudio (& OP's as well, I presume). Demonstration below:

library(grid)

r1 <- rectGrob(width = unit(0, "npc"), gp = gpar(col = "red", fill = "grey")) # zero-width
r2 <- rectGrob(height = unit(0, "npc"), gp = gpar(col = "red", fill = "grey")) # zero-height

grid.draw(r1) # depicted as a vertical line, rather than disappear completely
grid.draw(r2) # depicted as a horizontal line, rather than disappear completely


In this case, if we check the data frame associated with the histogram layer, there are many rows with ymin = ymax = 0, which are responsible for the 'baseline' effect seen in the question.

p <- ggplot(data, aes(x = value)) +
  geom_histogram(color = "red") +
  facet_wrap(~ key, ncol = 1)

View(layer_data(p) %>% filter(PANEL == 2)) # look at the data associated with facet panel 2
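
To check this without the interactive viewer, the zero-height rows can also be counted per panel. This is only a sketch: it rebuilds similar data with base R, and the seed, the explicit bins = 30 and the names p_check, ld and zero_rows are my own choices rather than anything from the question, so the exact counts will differ from the plots above.

```r
library(ggplot2)

# stand-in for the question's data: three chi-squared samples in long format
set.seed(1)
data <- data.frame(
  key   = rep(c("a", "b", "c"), each = 1000),
  value = c(rchisq(1000, df = 3), rchisq(1000, df = 1), rchisq(1000, df = 10))
)

p_check <- ggplot(data, aes(x = value)) +
  geom_histogram(color = "red", bins = 30) +
  facet_wrap(~ key, ncol = 1)

# data frame computed for the histogram layer, one row per bin per panel
ld <- layer_data(p_check)

# bins whose rectangle has zero height
zero_rows <- ld[ld$ymin == 0 & ld$ymax == 0, ]

# number of zero-height bins in each facet panel
table(zero_rows$PANEL)
```

Every row counted here is a bin with count 0 that is still passed on to be drawn as a rectangle.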

Workaround: Since the data calculations are done in StatBin's compute_group function, we can define an alternative version of the same function, with an additional step to drop the 0-count rows from the data frame completely:

# StatBin2 inherits from StatBin; the only change is an extra step near the
# end of compute_group() that drops the 0-count rows
StatBin2 <- ggproto(
  "StatBin2", 
  StatBin,
  compute_group = function (data, scales, binwidth = NULL, bins = NULL, 
                            center = NULL, boundary = NULL, 
                            closed = c("right", "left"), pad = FALSE, 
                            breaks = NULL, origin = NULL, right = NULL, 
                            drop = NULL, width = NULL) {
    if (!is.null(breaks)) {
      if (!scales$x$is_discrete()) {
        breaks <- scales$x$transform(breaks)
      }
      bins <- ggplot2:::bin_breaks(breaks, closed)
    } else if (!is.null(binwidth)) {
      if (is.function(binwidth)) {
        binwidth <- binwidth(data$x)
      }
      bins <- ggplot2:::bin_breaks_width(scales$x$dimension(), binwidth, 
                                         center = center, boundary = boundary, 
                                         closed = closed)
    } else {
      bins <- ggplot2:::bin_breaks_bins(scales$x$dimension(), bins, 
                                        center = center, boundary = boundary, 
                                        closed = closed)
    }
    res <- ggplot2:::bin_vector(data$x, bins, weight = data$weight, pad = pad)

    # drop 0-count bins completely before returning the dataframe
    res <- res[res$count > 0, ] 

    res
  })

Usage:

ggplot(data, aes(x = value)) +
  geom_histogram(color = "red", stat = StatBin2) + # specify stat = StatBin2
  facet_wrap(~ key, ncol = 1)
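
Because StatBin2 reaches into ggplot2 internals via :::, it may stop working if those internal functions change. A rough alternative sketch, assuming you are happy to bin the data yourself: compute the counts per facet with base hist() on shared breaks, drop the empty bins, and draw what is left with geom_rect. The names breaks, binned and p_trim are illustrative, the data is a base R stand-in for the question's tibble, and the bin edges will not match stat_bin's defaults exactly.

```r
library(ggplot2)

# stand-in for the question's data: three chi-squared samples in long format
set.seed(1)
data <- data.frame(
  key   = rep(c("a", "b", "c"), each = 1000),
  value = c(rchisq(1000, df = 3), rchisq(1000, df = 1), rchisq(1000, df = 10))
)

# 30 equal-width bins shared by all facets, spanning the full range of the data
breaks <- seq(min(data$value), max(data$value), length.out = 31)

# bin each facet level separately: one row per bin, with its edges and count
binned <- do.call(rbind, lapply(split(data, data$key), function(d) {
  h <- hist(d$value, breaks = breaks, plot = FALSE)
  data.frame(
    key   = d$key[1],
    xmin  = head(h$breaks, -1),
    xmax  = tail(h$breaks, -1),
    count = h$counts
  )
}))

# drop the empty bins so no zero-height rectangle is ever drawn
binned <- binned[binned$count > 0, ]

p_trim <- ggplot(binned, aes(xmin = xmin, xmax = xmax, ymin = 0, ymax = count)) +
  geom_rect(color = "red") +
  facet_wrap(~ key, ncol = 1)
p_trim
```

As with StatBin2, empty bins in the middle of a facet's range are dropped too, not just the ones beyond its data.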


Upvotes: 7
