Reputation: 1919
I've created a histogram/density plot function where I want the y axis to be count rather than density, but am having problems parameterizing its binwidth.
I am using examples based on http://docs.ggplot2.org/current/geom_histogram.html to illustrate my attempts.
Here's the successful plotMovies1 function. Following the URL referenced in the comment, it puts ..count.. rather than ..density.. on the y axis. Note that it hardcodes a binwidth of .5 in two places, which is what I want to parameterize ...
# I want y axis as count, rather than density, and followed
# https://stat.ethz.ch/pipermail/r-help/2011-June/280588.html
plotMovies1 <- function() {
m <- ggplot(movies, aes(x = rating))
m <- m + geom_histogram(binwidth = .5)
m <- m + geom_density(aes(y = .5 * ..count..))
}
My first, naive attempt in plotMovies2 moves the binwidth into a local variable bw, and fails ...
# Failed first attempt to parameterize binwidth
plotMovies2 <- function() {
bw <- .5
m <- ggplot(movies, aes(x = rating))
m <- m + geom_histogram(binwidth = bw)
# Error in eval(expr, envir, enclos) : object 'bw' not found
m <- m + geom_density(aes(y = bw * ..count..))
}
> print(plotMovies2())
Error in eval(expr, envir, enclos) : object 'bw' not found
I see discussion about passing the local environment to aes in ggplot at https://github.com/hadley/ggplot2/issues/743, but plotMovies3 also fails in the same fashion, failing to find the bw object ...
# Failed second attempt to parameterize binwidth, even after establishing
# aes environment, per https://github.com/hadley/ggplot2/issues/743
plotMovies3 <- function() {
bw <- .5
m <- ggplot(movies, aes(x = rating), environment = environment())
m <- m + geom_histogram(binwidth = bw)
# Error in eval(expr, envir, enclos) : object 'bw' not found
m <- m + geom_density(aes(y = bw * ..count..))
}
> print(plotMovies3())
Error in eval(expr, envir, enclos) : object 'bw' not found
I finally try setting a global, but it still fails to find the object ...
# Failed third attempt using global binwidth
global_bw <<- .5
plotMovies4 <- function() {
m <- ggplot(movies, aes(x = rating), environment = environment())
m <- m + geom_histogram(binwidth = global_bw)
# Error in eval(expr, envir, enclos) : object 'global_bw' not found
m <- m + geom_density(aes(y = global_bw * ..count..))
}
> print(plotMovies4())
Error in eval(expr, envir, enclos) : object 'global_bw' not found
Given plotMovies3 and plotMovies4, I am guessing it is not a straightforward environment issue. Can anyone shed any light on how I might resolve this? Again, my goal is a histogram/density plot function where the y axis is count rather than density and the binwidth is a parameter.
Upvotes: 3
Views: 2611
Reputation: 10510
This is a follow-up on mts's answer, intended as a long comment. First, the dataset is obtained by loading library("ggplot2movies"). Second, it may be of interest to loop over several values of binw to produce a series of figures to be used together, e.g. for an animation. So the code below simply puts mts's code into a loop for this purpose. A minor contribution indeed.
### Data
library("ggplot2")        # needed for ggplot(), geom_histogram(), etc.
library("ggplot2movies")  # provides the movies data set
### Histograms
ggplotMovieHistogram <- function(binw = 0.5) {
require('ggplot2movies')
p <- ggplot(movies, aes(x = rating)) +
geom_histogram(binwidth = binw)
wa <- density(x = movies$rating, bw = binw)
wa <- as.data.frame(cbind(xvals = wa$x, yvals = wa$y * wa$n * binw))
p <- p + geom_point(data = wa, aes(x = xvals, y = yvals))
return(p)
}
ggsaveMovieHistogram <- function(binw = 0.5, file = 'test.pdf') {
pdf(file, width = 8, height = 8)
print(ggplotMovieHistogram(binw = binw))
dev.off()
}
for(i in seq(0.2, 0.8, by = 0.2)) {
ggsaveMovieHistogram(binw = i,
file = paste0('ggplot-barchart-loop-histogram-',
format(i, decimal.mark = '-'),
'.pdf'))
}
### Densities
library("ggplot2movies")
ggplotMovieDensity <- function(binw = 0.5) {
require('ggplot2movies')
p <- ggplot(movies, aes(x = rating)) +
geom_density(aes(y = 0.5 * ..count..))
wa <- density(x = movies$rating, bw = binw)
wa <- as.data.frame(cbind(xvals = wa$x, yvals = wa$y * wa$n * binw))
p <- p + geom_point(data = wa, aes(x = xvals, y = yvals))
return(p)
}
ggsaveMovieDensity <- function(binw = 0.5, file = 'test.pdf') {
pdf(file, width = 8, height = 8)
print(ggplotMovieDensity(binw = binw))
dev.off()
}
for(i in seq(0.2, 0.8, by = 0.2)) {
ggsaveMovieDensity(binw = i,
file = paste0('ggplot-barchart-loop-density-',
format(i, decimal.mark = '-'),
'.pdf'))
}
Upvotes: 1
Reputation: 772
An alternative is the use of predefined bins with aes_string. Histograms then may be created by a loop with variable binwidths:
bins <<- list()
bins[["Variable1"]] <- 2
bins[["Variable2"]] <- 0.5
bins[["Variable3"]] <- 1
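# Loop over the predefined bins; each iteration prints one histogram
for (i in names(bins)) {
  print(ggplot(movies, aes(x = rating)) +
    geom_histogram(aes_string(y = paste("..density..*", bins[[i]], sep = "")),
                   na.rm = TRUE, position = 'dodge', binwidth = bins[[i]]))
}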
Upvotes: 1
Reputation: 2190
By no means beautiful, but if you need a workaround you can use the regular density function:
plotMovies5 <- function(binw=0.5) {
m <- ggplot(movies, aes(x = rating))
m <- m + geom_histogram(binwidth = binw)
wa <- density(x=movies$rating, bw = binw)
wa <- as.data.frame(cbind(xvals = wa$x, yvals = wa$y * wa$n * binw))
m <- m + geom_point(data = wa, aes(x = xvals, y = yvals))
}
print(plotMovies5(binw=0.25))
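The scaling `wa$y * wa$n * binw` relies on the relation count ≈ density × n × binwidth. A rough base-R check of that relation, as a sketch on simulated data (the data and the variable names below are illustrative and not taken from the movies set):

```r
set.seed(42)
x <- rnorm(5000, mean = 6, sd = 1.5)  # simulated stand-in for movies$rating
binw <- 0.5

d <- density(x)

# Trapezoidal rule: the density curve integrates to roughly 1 ...
area_density <- sum(diff(d$x) * (head(d$y, -1) + tail(d$y, -1)) / 2)

# ... so after scaling by n * binw it integrates to roughly n * binw,
scaled <- d$y * d$n * binw
area_scaled <- sum(diff(d$x) * (head(scaled, -1) + tail(scaled, -1)) / 2)

# which is also the total bar area of a count histogram with bins of width binw.
breaks <- seq(floor(min(x)), ceiling(max(x)) + binw, by = binw)
h <- hist(x, breaks = breaks, plot = FALSE)
area_hist <- sum(h$counts * binw)
```

Comparing `area_scaled` with `area_hist` shows how closely the scaled curve matches the histogram in total area; the shape can still differ, since the kernel bandwidth `bw` passed to `density()` is a smoothing parameter and not a bin width.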
Note that you still have to do some tinkering with the variables, because the two density estimates are not exactly equal, as the following shows:
binw = 0.5
m <- ggplot(movies, aes(x = rating))
m <- m + geom_density(aes(y = 0.5 * ..count..))
wa <- density(x=movies$rating, bw = binw)
wa <- as.data.frame(cbind(xvals = wa$x, yvals = wa$y * wa$n * binw))
m <- m + geom_point(data = wa, aes(x = xvals, y = yvals))
m
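A possibly more direct route to the parameterized function asked about in the question, as a sketch only: it assumes ggplot2 3.3.0 or later, where `after_stat()` is available and `aes()` captures the environment it is called in, so a local variable can be used inside the aesthetic. The name `plotMovies6`, its `data` argument, and the simulated data frame are made up for illustration, and the kernel bandwidth is left at the `geom_density` default.

```r
library(ggplot2)

# Hypothetical helper; assumes ggplot2 >= 3.3.0 for after_stat().
plotMovies6 <- function(data, binw = 0.5) {
  ggplot(data, aes(x = rating)) +
    geom_histogram(binwidth = binw) +
    # count * binw rescales the density curve to the histogram's count axis
    geom_density(aes(y = after_stat(count * binw)))
}

# Simulated ratings stand in for the movies data so the sketch is self-contained.
set.seed(1)
fake <- data.frame(rating = pmin(pmax(rnorm(1000, mean = 6, sd = 1.5), 1), 10))
p <- plotMovies6(fake, binw = 0.25)
```

With the real data this would be called as `print(plotMovies6(movies, binw = 0.25))` after loading ggplot2movies.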
Upvotes: 3