Reputation: 850
I have a continuous variable ranging between 0 and 1 that I have binned into unequal-width bins (all the bins have the same width except the last, which combines all values over a threshold). I would like to make a box plot where the width of each box covers the x-range of its bin. Here is a piece of code that makes two plots, one with equal-width bins and one with my binning.
library(ggplot2)
x <- runif(100, 0, 1)
y <- ifelse(x < 0.3, 2 * x, 0.75) + runif(100, 0, 0.15)
# Equal-width bins, labelled by their midpoints
xbin <- cut(x = x, breaks = seq(0, 1, 0.1), include.lowest = TRUE, labels = seq(0.05, 0.95, 0.1))
df <- data.frame(x = x, y = y, xbin = xbin)
ggplot(df, aes(x = xbin, y = y)) + geom_boxplot()
# Unequal-width bins: everything above 0.3 is merged into one bin
xbin <- cut(x = x, breaks = c(seq(0, 0.3, 0.1), 1), include.lowest = TRUE, labels = c(seq(0.05, 0.25, 0.1), ">0.3"))
df <- data.frame(x = x, y = y, xbin = xbin)
ggplot(df, aes(x = xbin, y = y)) + geom_boxplot()
I would like the last box to take up the space of all the bins that were merged. I am afraid that the plot is misleading, because the last box covers a much larger x-range than the others but is drawn at the same width. The answer may be that there is a better way of presenting the data. My real data is slightly concentrated at 0 and 1, with fewer points around 0.5, so I would like to bin the data (unlike the case in How to create geom_boxplot with large amount of continuous x-variables).
Thank you
Upvotes: 1
Views: 938
Reputation: 27732
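Like this? Here the continuous x (not xbin) goes on the x axis and xbin only defines the groups, so each box is drawn over roughly the x-range of the points in its bin.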
ggplot( data = df, aes( x = x, y = y, colour = xbin ) ) + geom_boxplot()
Or perhaps a violin plot?
ggplot( data = df, aes( x = x, y = y, colour = xbin)) + geom_violin() + geom_point( alpha = 0.5 )
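If the boxes should line up exactly with the bin edges rather than with the range of the data in each bin, one option is to compute the box statistics per bin and draw the boxes by hand. The following is only a minimal sketch under some assumptions: the names `breaks`, `bin`, `box_df` and `xmid` are made up for illustration, the seed is fixed just to make it reproducible, the statistics come from base R's `boxplot.stats()` (which uses hinges, so the quartiles can differ slightly from those of `geom_boxplot()`), and outliers are not drawn.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
x <- runif(100, 0, 1)
y <- ifelse(x < 0.3, 2 * x, 0.75) + runif(100, 0, 0.15)

# Same unequal-width bins as in the question
breaks <- c(seq(0, 0.3, 0.1), 1)
bin <- cut(x, breaks = breaks, include.lowest = TRUE, labels = FALSE)  # integer bin index

# One row per non-empty bin: bin edges plus the five box statistics
box_df <- do.call(rbind, lapply(sort(unique(bin)), function(i) {
  s <- boxplot.stats(y[bin == i])$stats  # lower whisker, Q1, median, Q3, upper whisker
  data.frame(xmin = breaks[i], xmax = breaks[i + 1],
             ymin = s[1], lower = s[2], middle = s[3], upper = s[4], ymax = s[5])
}))
box_df$xmid <- (box_df$xmin + box_df$xmax) / 2

p <- ggplot(box_df) +
  # whiskers, drawn at the centre of each bin
  geom_segment(aes(x = xmid, xend = xmid, y = ymin, yend = ymax)) +
  # box from Q1 to Q3, spanning the full width of the bin
  geom_rect(aes(xmin = xmin, xmax = xmax, ymin = lower, ymax = upper),
            fill = "white", colour = "black") +
  # median line
  geom_segment(aes(x = xmin, xend = xmax, y = middle, yend = middle)) +
  labs(x = "x", y = "y")
p
```

With these breaks the last box spans 0.3 to 1 on a continuous x axis, so its width reflects the merged bins. Adding `geom_point(data = data.frame(x = x, y = y), aes(x = x, y = y), alpha = 0.5)` on top would also show the raw points.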
Upvotes: 1