Cheeseburgler

Reputation: 31

Set binwidth to a fixed value for a ggplot histogram

I'm having trouble creating a histogram using ggplot.

I have a data structure as follows:

value_1
112.45
2457.44
333.24

The column continues for about 25,000 more observations.

I want a histogram whose bins are 100 wide and aligned to multiples of 100: 0-100, then 100-200, then 200-300, and so on up to the maximum value.

For the example above, that would give one count in the bin 100-200, one count in the bin 300-400 and one count in the bin 2400-2500.

Could you point me in the right direction?

Upvotes: 1

Views: 6248

Answers (1)

mt1022

Reputation: 17289

You can get bins of the right width and alignment by setting binwidth together with either center or boundary:

df <- data.frame(x = c(112.45, 2457.44, 333.24))

library(ggplot2)  # 2.2.1
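# 100-wide bins, with one bin centered on 150 (so it spans 100-200)
ggplot(df, aes(x)) + geom_histogram(binwidth = 100, center = 150)
# or: 100-wide bins, with a bin edge at 100
ggplot(df, aes(x)) + geom_histogram(binwidth = 100, boundary = 100)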

center

The center of one of the bins. Note that if center is above or below the range of the data, things will be shifted by an appropriate number of widths. To center on integers, for example, use width = 1 and center = 0, even if 0 is outside the range of the data. At most one of center and boundary may be specified.

boundary

A boundary between two bins. As with center, things are shifted when boundary is outside the range of the data. For example, to center on integers, use width = 1 and boundary = 0.5, even if 0.5 is outside the range of the data. At most one of center and boundary may be specified.
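To check that the bins land where you expect, one option is to pull the computed bins back out of the plot. This is a minimal sketch, assuming a ggplot2 version that exports layer_data(); the names p, bins and nonempty are only illustrative.

```r
library(ggplot2)

df <- data.frame(x = c(112.45, 2457.44, 333.24))

# Same call as above: 100-wide bins with a bin edge at 100
p <- ggplot(df, aes(x)) + geom_histogram(binwidth = 100, boundary = 100)

# layer_data() returns the data computed by the stat, one row per bin
bins <- layer_data(p)

# Keep only the non-empty bins and show their edges and counts
nonempty <- bins[bins$count > 0, c("xmin", "xmax", "count")]
print(nonempty)
```

For the three sample values, the non-empty bins should be the ones starting at 100, 300 and 2400, each with a count of 1.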

If you know the range of the data, you can also set the bin edges manually, using only breaks = in geom_histogram.
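A rough sketch of the breaks approach, assuming the data are non-negative so the first edge can sit at 0. The names upper and edges are illustrative, and the upper edge is rounded up to the next multiple of 100 so the largest value still falls inside the last bin.

```r
library(ggplot2)

df <- data.frame(x = c(112.45, 2457.44, 333.24))

# Round the maximum up to the next multiple of 100 to get the last bin edge
upper <- ceiling(max(df$x) / 100) * 100

# Explicit bin edges: 0, 100, 200, ..., upper
edges <- seq(0, upper, by = 100)

# breaks takes the full vector of bin edges, so binwidth, center and
# boundary are not needed here
p <- ggplot(df, aes(x)) + geom_histogram(breaks = edges)
p
```

Unlike binwidth with boundary, this also draws the empty 0-100 bin, because the edges start at 0 rather than at the first bin that contains data.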

Upvotes: 3
