Ken Lin

Reputation: 1919

Parameterized ggplot2 histogram/density aes function cannot find object

I've created a histogram/density plot function where I want the y axis to be count rather than density, but am having problems parameterizing its binwidth.

I am using examples based on http://docs.ggplot2.org/current/geom_histogram.html to illustrate my attempts.

Here's the working plotMovies1 function. Following the referenced URL, the y axis uses ..count.. instead of ..density.. (note the hardcoded .5 binwidth in two places, which is what I want to parameterize):

# I want y axis as count, rather than density, and followed
# https://stat.ethz.ch/pipermail/r-help/2011-June/280588.html
plotMovies1 <- function() {
  m <- ggplot(movies, aes(x = rating))
  m <- m + geom_histogram(binwidth = .5)
  m <- m + geom_density(aes(y = .5 * ..count..))
}

[Figure: histogram/density with count as y axis and hardcoded binwidth]
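
As a rough check of why multiplying by the binwidth puts the density layer on the histogram's count scale, here is a minimal base R sketch. It assumes simulated ratings in place of movies$rating, and the variable names are illustrative only. A bar's height is about n * binwidth * density, so the two layers should enclose roughly the same area.

```r
# Simulated ratings stand in for movies$rating (an assumption for illustration)
set.seed(1)
rating <- runif(1000, min = 1, max = 10)
n <- length(rating)
binwidth <- 0.5

# Histogram counts using bins of width 0.5
h <- hist(rating, breaks = seq(1, 10, by = binwidth), plot = FALSE)

# Kernel density estimate; on its own it integrates to roughly 1
d <- density(rating)

# Put the density on the count scale: density * n * binwidth
scaled <- d$y * n * binwidth

# Compare the area under the scaled curve with the total area of the bars
dx <- diff(d$x[1:2])
area_curve <- sum(scaled) * dx
area_bars <- sum(h$counts) * binwidth
print(c(curve = area_curve, bars = area_bars))
```

The two areas should agree closely, which is the reason the density layer is multiplied by the same binwidth that the histogram uses.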

My first, naive attempt at parameterizing binwidth, using a local variable bw in plotMovies2, fails ...

# Failed first attempt to parameterize binwidth
plotMovies2 <- function() {
  bw <- .5
  m <- ggplot(movies, aes(x = rating))
  m <- m + geom_histogram(binwidth = bw)
# Error in eval(expr, envir, enclos) : object 'bw' not found 
  m <- m + geom_density(aes(y = bw * ..count..))
}
> print(plotMovies2())
Error in eval(expr, envir, enclos) : object 'bw' not found

I see discussion about passing the local environment to aes in ggplot at https://github.com/hadley/ggplot2/issues/743, but plotMovies3 also fails in the same fashion, failing to find the bw object ...

# Failed second attempt to parameterize binwidth, even after establishing
# aes environment, per https://github.com/hadley/ggplot2/issues/743
plotMovies3 <- function() {
  bw <- .5
  m <- ggplot(movies, aes(x = rating), environment = environment())
  m <- m + geom_histogram(binwidth = bw)
# Error in eval(expr, envir, enclos) : object 'bw' not found 
  m <- m + geom_density(aes(y = bw * ..count..))
}
> print(plotMovies3())
Error in eval(expr, envir, enclos) : object 'bw' not found

I finally try setting a global, but it still fails to find the object ...

# Failed third attempt using global binwidth
global_bw <<- .5
plotMovies4 <- function() {
  m <- ggplot(movies, aes(x = rating), environment = environment())
  m <- m + geom_histogram(binwidth = global_bw)
# Error in eval(expr, envir, enclos) : object 'global_bw' not found 
  m <- m + geom_density(aes(y = global_bw * ..count..))
}
> print(plotMovies4())
Error in eval(expr, envir, enclos) : object 'global_bw' not found

Given plotMovies3 and plotMovies4, I am guessing it is not a straightforward environment issue. Can anyone shed any light on how I might resolve this? Again, my goal was to be able to create a histogram/density plot function where

  1. Its y axis is count rather than density, and
  2. Its binwidth could be parameterized (e.g., for manipulate)

Upvotes: 3

Views: 2611

Answers (3)

PatrickT

Reputation: 10510

This is a follow-up to mts's answer, intended as a long comment. First, the movies dataset is obtained by loading library("ggplot2movies"). Second, it may be useful to loop over several values of binw to produce a series of figures for use together in, e.g., an animation. The code below simply wraps mts's code in a loop for this purpose. A minor contribution indeed.

    ### Data
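    library("ggplot2")        # provides ggplot(), geom_histogram(), geom_density()
    library("ggplot2movies")  # provides the movies dataset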

    ### Histograms
    ggplotMovieHistogram <- function(binw = 0.5) {
        require('ggplot2movies')
        p <- ggplot(movies, aes(x = rating)) + 
            geom_histogram(binwidth = binw)
        wa <- density(x = movies$rating, bw = binw)
        wa <- as.data.frame(cbind(xvals = wa$x, yvals = wa$y * wa$n * binw))
        p <- p + geom_point(data = wa, aes(x = xvals, y = yvals))
        return(p)
    }

    ggsaveMovieHistogram <- function(binw = 0.5, file = 'test.pdf') {
        pdf(file, width = 8, height = 8)
            print(ggplotMovieHistogram(binw = binw))
        dev.off()
    }

    for(i in seq(0.2, 0.8, by = 0.2)) {
        ggsaveMovieHistogram(binw = i, 
                    file = paste0('ggplot-barchart-loop-histogram-', 
                                  format(i, decimal.mark = '-'), 
                                  '.pdf'))
    }


    ### Densities
    library("ggplot2movies")
    ggplotMovieDensity <- function(binw = 0.5) {
        require('ggplot2movies')
        p <- ggplot(movies, aes(x = rating)) + 
            geom_density(aes(y = 0.5 * ..count..))
        wa <- density(x = movies$rating, bw = binw)
        wa <- as.data.frame(cbind(xvals = wa$x, yvals = wa$y * wa$n * binw))
        p <- p + geom_point(data = wa, aes(x = xvals, y = yvals))
        return(p)
    }

    ggsaveMovieDensity <- function(binw = 0.5, file = 'test.pdf') {
        pdf(file, width = 8, height = 8)
            print(ggplotMovieDensity(binw = binw))
        dev.off()
    }

    for(i in seq(0.2, 0.8, by = 0.2)) {
        ggsaveMovieDensity(binw = i, 
                    file = paste0('ggplot-barchart-loop-density-', 
                                  format(i, decimal.mark = '-'), 
                                  '.pdf'))
    }

Upvotes: 1

yPennylane

Reputation: 772

An alternative is the use of predefined bins with aes_string. Histograms then may be created by a loop with variable binwidths:

bins <<- list()
bins[["Variable1"]] <- 2
bins[["Variable2"]] <- 0.5
bins[["Variable3"]] <- 1
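# Loop over the predefined binwidths; aes_string builds the y mapping from
# text, so the binwidth value is pasted into the expression itself
for (i in names(bins)) {
  print(ggplot(movies, aes(x = rating)) +
    geom_histogram(aes_string(x = "rating", y = paste("..density..*", bins[[i]], sep = "")),
                   na.rm = TRUE, position = 'dodge', binwidth = bins[[i]]))
}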
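
To illustrate why the aes_string approach avoids the lookup problem, here is a small base R sketch. The names y_text and y_expr are illustrative only. Pasting the number into the text means the resulting expression no longer refers to a variable that has to be found later.

```r
binwidth <- 0.5

# paste() converts the number to text, so the value becomes part of the expression
y_text <- paste("..density..*", binwidth, sep = "")
print(y_text)

# Parse the text and list the variables the expression still refers to;
# binwidth is not among them, only the computed ..density.. variable
y_expr <- parse(text = y_text)[[1]]
print(all.vars(y_expr))
```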

Upvotes: 1

mts

Reputation: 2190

By no means beautiful, but if you need a workaround you can use the regular density function:

plotMovies5 <- function(binw=0.5) {
  m <- ggplot(movies, aes(x = rating))
  m <- m + geom_histogram(binwidth = binw)
  wa <- density(x=movies$rating, bw = binw)
  wa <- as.data.frame(cbind(xvals = wa$x, yvals = wa$y * wa$n * binw))
  m <- m + geom_point(data = wa, aes(x = xvals, y = yvals))
}
print(plotMovies5(binw=0.25))
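
For comparison, a sketch of how this might look with a more recent ggplot2, assuming version 3.3.0 or later, where after_stat() is available and aes() captures the calling environment. The function name plotRatings and the simulated data are assumptions for illustration, not part of the original question.

```r
library(ggplot2)

# Simulated ratings stand in for movies$rating (an assumption for illustration)
set.seed(1)
ratings <- data.frame(rating = runif(1000, min = 1, max = 10))

# plotRatings is a hypothetical name; binw is an ordinary function argument
plotRatings <- function(data, binw = 0.5) {
  ggplot(data, aes(x = rating)) +
    geom_histogram(binwidth = binw) +
    # after_stat(count) plays the role of ..count..; binw is looked up in the
    # function's environment, which aes() captures along with the expression
    geom_density(aes(y = binw * after_stat(count)))
}

p <- plotRatings(ratings, binw = 0.25)
print(p)
```

On older ggplot2 versions, such as the one that produced the errors in the question, this is not expected to work, so the density() workaround above remains relevant there.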

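Note that you still have to do some tinkering with the variables, because the two density estimates are not exactly equal (geom_density picks its bandwidth with its default rule, while density is called here with bw = binw), as the following will show you: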

binw = 0.5
m <- ggplot(movies, aes(x = rating))
m <- m + geom_density(aes(y = 0.5 * ..count..))
wa <- density(x=movies$rating, bw = binw)
wa <- as.data.frame(cbind(xvals = wa$x, yvals = wa$y * wa$n * binw))
m <- m + geom_point(data = wa, aes(x = xvals, y = yvals))
m
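
One likely source of the mismatch can be seen in base R alone. This is a sketch under the assumption that simulated ratings behave like movies$rating for this purpose. density() with no bw argument chooses a bandwidth with the "nrd0" rule, which is also geom_density's default, whereas passing bw = 0.5 fixes the kernel bandwidth at 0.5.

```r
# Simulated ratings stand in for movies$rating (an assumption for illustration)
set.seed(1)
rating <- runif(1000, min = 1, max = 10)

# Default bandwidth, chosen by the nrd0 rule of thumb
d_default <- density(rating)

# Fixed bandwidth, as in density(x = movies$rating, bw = binw) with binw = 0.5
d_fixed <- density(rating, bw = 0.5)

# The two estimates use different bandwidths, so the curves differ
print(c(default = d_default$bw, fixed = d_fixed$bw))
```

Passing the same bw to both geom_density and density should bring the two curves closer together.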

Upvotes: 3
