Reputation: 618
I have a vector of data that can take values between 1 and 100. When I plot a histogram with a bin width of 10, I would expect 10 bins with ranges of 1-10, 11-20, etc. Yet, I end up getting a plot that looks like this:
As you can see, the ranges of the outer bins go beyond the bounds for the values that my data can take (0 and 100).
Is there a way I can generate the histogram so that it has exactly n bins between a set range?
Upvotes: 0
Views: 1824
Reputation: 5747
You can do everything you want with the breaks argument to geom_histogram. You can set specific (and arbitrary) bin widths if that pleases you. The breaks argument overrides the bins and binwidth arguments.
library(ggplot2)
set.seed(123)
x <- data.frame(x = sample(1:100, 1000, replace = TRUE))
ggplot(x) +
geom_histogram(aes(x), breaks = c(0, 13, 27, 45, 88, 100), color = "black") +
scale_x_continuous(breaks = c(0, 13, 27, 45, 88, 100))
If you want n equal bins in a specific range (say 0-100), use breaks = seq(0, 100, 100/n). This is useful when the range needs to be wider than the data. For example, if a random sample happens to contain no value greater than 91, but you know that 100 is a possible value, the last bin still needs to extend to 100.
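As a minimal sketch of the equal-width case, assuming n = 10 and a range of 0 to 100 (the names df, n, brks, p, and bins are only illustrative): n bins need n + 1 edges, so seq() with length.out = n + 1 gives the same result as seq(0, 100, 100/n). As far as I know, geom_histogram closes bins on the right by default, so with integer data the first bin holds 1-10, the second 11-20, and so on.

```r
library(ggplot2)

set.seed(123)
df <- data.frame(x = sample(1:100, 1000, replace = TRUE))

n <- 10                                   # desired number of bins (assumption)
brks <- seq(0, 100, length.out = n + 1)   # n + 1 edges define n bins

p <- ggplot(df, aes(x)) +
  geom_histogram(breaks = brks, color = "black") +
  scale_x_continuous(breaks = brks)       # put axis ticks on the bin edges

# Inspect the bins ggplot2 actually computed
bins <- ggplot_build(p)$data[[1]]
nrow(bins)                                # number of bins
range(c(bins$xmin, bins$xmax))            # outer edges of the histogram
```

Printing p draws the plot; the ggplot_build() call is only there to check that the bin edges match what was requested.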
Upvotes: 1
Reputation: 5721
You can use a binned scale with geom_bar:
library(ggplot2)
ggplot(data.frame(v = sample(1:100, 100, replace = TRUE)), aes(x = v)) +
  geom_bar() +
  scale_x_binned(n.breaks = 10)
Example output: https://i.sstatic.net/Qvx7r.png
Upvotes: 1