Reputation: 155
Data:
data = data.frame(rnorm(250, 90, sd = 30))
I want to create a histogram with bins of fixed width, except that all observations larger than one arbitrary number, or smaller than another arbitrary number, are grouped into their own bins. Taking the data above as an example, I want binwidth = 10, but with all values above 100 together in one bin and all values below 20 together in another.
I looked at some answers, but they make no sense to me since they are mostly code. I would appreciate it greatly if somebody can explain the steps.
Upvotes: 0
Views: 1706
Reputation: 93811
The examples below show how to create the desired histogram in base graphics and with ggplot2. Note that the resulting histogram will be quite distorted compared to one with a constant break size.
The R function hist creates the histogram and allows us to set whatever bins we want using the breaks argument:
# Fake data
set.seed(1049)
dat = data.frame(value=rnorm(250, 90, 30))
hist(dat$value, breaks=c(min(dat$value), seq(20,100,10), max(dat$value)))
In the code above, c(min(dat$value), seq(20,100,10), max(dat$value)) sets breaks that start at the lowest data value and end at the highest data value, so everything below 20 falls into the first bin and everything above 100 into the last. In between, we use seq to create a sequence of breaks that goes from 20 to 100 in increments of 10. The ggplot2 version passes the same breaks to geom_histogram:
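library(ggplot2)
# Same breaks as the base-graphics version, with density on the y axis
# after_stat() needs ggplot2 >= 3.3.0; older versions use ..density.. instead
ggplot(dat, aes(value)) +
  geom_histogram(breaks=c(min(dat$value), seq(20,100,10), max(dat$value)),
                 aes(y=after_stat(density)), color="grey30", fill=hcl(240,100,65)) +
  theme_light()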
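Since the question asked for the steps to be spelled out, here is a minimal sketch that builds the breaks one piece at a time and inspects the bins without plotting. The names inner, brks, bins, counts and h are introduced here for illustration only, and the sketch assumes the data really do extend below 20 and above 100. It also shows where the distortion mentioned above comes from: with unequal bin widths, density is the count divided by (n × bin width), so the two wide outer bins are drawn shorter than their counts alone would suggest.

```r
# Same simulated data as in the answer above
set.seed(1049)
dat <- data.frame(value = rnorm(250, 90, 30))

# Step 1: interior breaks 20, 30, ..., 100 (eight bins of width 10)
inner <- seq(20, 100, by = 10)

# Step 2: add the data minimum and maximum as the outermost breaks, so one
# catch-all bin covers everything below 20 and another everything above 100
# (assumes min(dat$value) < 20 and max(dat$value) > 100)
brks <- c(min(dat$value), inner, max(dat$value))

# Step 3: assign each observation to a bin and count them;
# include.lowest = TRUE keeps the minimum, which sits on the first break
bins <- cut(dat$value, breaks = brks, include.lowest = TRUE)
counts <- table(bins)
print(counts)

# Step 4: compute the histogram without drawing it and compare
h <- hist(dat$value, breaks = brks, plot = FALSE)
print(diff(h$breaks))   # bin widths: the two outer bins are wider than 10
print(h$density)        # equals h$counts / (250 * bin width)
```

If the bar heights should show counts instead, hist accepts freq=TRUE, although R warns that areas are then misleading when the bins have unequal widths.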
Upvotes: 1