Nova

Reputation: 618

How to generate a histogram so that it has exactly n bins between a set range in ggplot2?

I have a vector of data that can take values between 1 and 100. When I plot a histogram with a bin width of 10, I would expect 10 bins with ranges of 1-10, 11-20, etc. Yet, I end up getting a plot that looks like this:

(image: the resulting histogram)

As you can see, the ranges of the outer bins go beyond the bounds for the values that my data can take (0 and 100).

Is there a way I can generate the histogram so that it has exactly n bins between a set range?

Upvotes: 0

Views: 1824

Answers (2)

Ben Norris

Reputation: 5747

You can do everything you want with the breaks argument to geom_histogram. It takes the bin edges directly, so you can set specific (and arbitrary) bin widths if that pleases you. The breaks argument overrides the bins and binwidth arguments.

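library(ggplot2)
set.seed(123)  # make the random sample reproducible
# 1000 integer values drawn from 1 to 100
x <- data.frame(x = sample(1:100, 1000, replace = TRUE))
ggplot(x) +
  # breaks gives the bin edges, so the bins here have unequal widths
  geom_histogram(aes(x), breaks = c(0, 13, 27, 45, 88, 100), color = "black") +
  # put the axis ticks on the bin edges
  scale_x_continuous(breaks = c(0, 13, 27, 45, 88, 100))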

histogram with arbitrary breaks

If you want n equal bins in a specific range (say 0-100), use breaks = seq(0, 100, 100/n). This can be useful if you want to have a range that is wider than the data. For example, in my random sample, no value greater than 91 is present, but I know that 100 is a possible value, so my bin needs to extend to 100.
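As a minimal sketch of the equal-width case, the snippet below builds the edges for n bins over a fixed range and passes them to geom_histogram. The names n, lower, upper, bin_edges, df and p are only illustrative, and seq(..., length.out = n + 1) is used here as an alternative to the 100/n step so that the last edge lands on the upper bound.

```r
library(ggplot2)

n <- 10       # desired number of bins (illustrative value)
lower <- 0    # lower edge of the first bin
upper <- 100  # upper edge of the last bin

# n bins need n + 1 edges
bin_edges <- seq(lower, upper, length.out = n + 1)

set.seed(123)
df <- data.frame(x = sample(1:100, 1000, replace = TRUE))

p <- ggplot(df, aes(x)) +
  geom_histogram(breaks = bin_edges, color = "black") +
  scale_x_continuous(breaks = bin_edges)

# inspect the bins ggplot2 actually computed: one row per bin
layer_data(p)[, c("xmin", "xmax", "count")]
```

With the default closed = "right", each bin should include its upper edge, so integer data from 1 to 100 would fall into 1-10, 11-20 and so on, which is what the question asks for.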

Upvotes: 1

Ric

Reputation: 5721

You can use a binned scale with geom_bar:

ggplot(data.frame(v=sample(1:100, 100, TRUE)), aes(x=v)) + 
geom_bar() +
scale_x_binned(n.breaks = 10)

example https://i.sstatic.net/Qvx7r.png
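One caveat, as far as I understand the scale: n.breaks is a request, and with the default nice.breaks = TRUE the scale may settle on a different number of breaks. A hedged variation is to also fix the limits so the bins are computed over a known range. The limits = c(0, 100) value and the names df and p are assumptions for illustration, not part of the answer above.

```r
library(ggplot2)

set.seed(123)
df <- data.frame(v = sample(1:100, 100, TRUE))

# Same approach as above, with the scale limits pinned to 0-100 (assumed range)
p <- ggplot(df, aes(x = v)) +
  geom_bar() +
  scale_x_binned(n.breaks = 10, limits = c(0, 100))

p
```

If the number of bars still differs from what you asked for, the breaks argument approach with geom_histogram gives you direct control over every edge.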

Upvotes: 1
