R ggplot2 histogram bin allocation

Question

My problem is that when I build histograms in ggplot2 with a bin width greater than the resolution of the data, bins sometimes contain uneven numbers of increments from the underlying data. This produces large peaks in the histogram, which give a false impression of how peaky the data are. Is there a built-in way to prevent this? Perhaps by allocating increments between bins?

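library(ggplot2)
library(ggplot2movies)  # provides the movies data set
m <- ggplot(movies, aes(x = rating))
# Original resolution: ratings are recorded in steps of 0.1
print(m + geom_histogram(binwidth = 0.1) + scale_y_sqrt())
# Downsampled: 0.25 is not a whole multiple of 0.1, so bins cover 2 or 3 rating values
print(m + geom_histogram(binwidth = 0.25) + scale_y_sqrt())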

Matt · Accepted Answer

My workaround for now is simply to set binwidth as a function of the data resolution (for example, a whole multiple of the 0.1 rating step), rather than deriving it from a desired number of bins.
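A minimal sketch of that workaround, not a built-in ggplot2 feature. The helper names data_resolution and aligned_binwidth are hypothetical, and the code assumes the data sit on a regular grid (as the 0.1-step ratings do). It estimates the resolution from the smallest gap between distinct values, then rounds the desired bin width up to a whole multiple of that resolution so every bin spans the same number of increments.

```r
# Hypothetical helper: estimate the data resolution as the smallest gap
# between distinct values (rounded to suppress floating-point noise).
data_resolution <- function(x, digits = 8) {
  vals <- sort(unique(round(x, digits)))
  round(min(diff(vals)), digits)
}

# Hypothetical helper: round a desired bin width UP to a whole multiple
# of the data resolution, so each bin covers the same number of increments.
aligned_binwidth <- function(x, target) {
  res <- data_resolution(x)
  n_steps <- ceiling(target / res - 1e-9)  # small tolerance for float error
  max(1, n_steps) * res
}

# Toy data standing in for movies$rating (assumed grid of 0.1 steps)
ratings <- seq(1, 10, by = 0.1)
res <- data_resolution(ratings)
bw  <- aligned_binwidth(ratings, target = 0.25)

# With ggplot2 available, the aligned width can be passed to geom_histogram().
# Setting boundary half a step below a data value keeps bin edges between
# data values rather than on top of them.
if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("ggplot2movies", quietly = TRUE)) {
  library(ggplot2)
  data(movies, package = "ggplot2movies")
  r_res <- data_resolution(movies$rating)
  r_bw  <- aligned_binwidth(movies$rating, target = 0.25)
  p <- ggplot(movies, aes(x = rating)) +
    geom_histogram(binwidth = r_bw,
                   boundary = min(movies$rating) - r_res / 2) +
    scale_y_sqrt()
  # print(p) draws the plot
}
```

Rounding up is an arbitrary choice here; rounding down to 0.2 would work equally well for the ratings data. The point is only that the bin width ends up a whole multiple of the step size.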
