Reputation: 391
I am trying to make a 2D histogram with the individual bins showing both the bin contents and a gradient. The data are integers ranging from 0 to 4 (only) in both axes.
I tried working with this answer but ran into a few issues. First, a few bins end up getting no gradient at all. In the MWE below, the bottom left bins of 130 and 60 seem to be blank. Second, the bins are shifted to below 0 in both axes. For this axis issue, I found I could simply add 0.5 to both x and y. In the end, though, I would also like the axis labels to be centered within a bin, and adding 0.5 does not address that.
library(ggplot2)
# Construct the data to be plotted
x <- c(rep(0,190),rep(1,50),rep(2,10),rep(3,40))
y <- c(rep(0,130),rep(1,80),rep(2,30),rep(3,10),rep(4,40))
data <- data.frame(x,y)
# Taken from the example
ggplot(data, aes(x = x, y = y)) +
  geom_bin2d(binwidth=1) +
  stat_bin2d(geom = "text", aes(label = ..count..), binwidth=1) +
  scale_fill_gradient(low = "snow3", high = "red", trans = "log10") +
  xlim(-1, 5) +
  ylim(-1, 5) +
  coord_equal()
Is there something obvious I am doing wrong in both the color gradients and axis labels? I am also not married to ggplot or stat_bin2d if there is a better way to do it with some other package/command. Thanks in advance!
Upvotes: 3
Views: 1881
Reputation: 93761
stat_bin2d uses the cut function to create the bins. By default, cut creates bins that are open on the left and closed on the right. stat_bin2d also sets include.lowest=TRUE so that the lowest interval will be closed on the left also. I haven't looked through the code for stat_bin2d to try and figure out exactly what's going wrong, but it seems like it has to do with how the breaks in cut are being chosen. In any case, you can get the desired behavior by setting the bin breaks explicitly to start at -1. For example:
ggplot(data, aes(x = x, y = y)) +
  geom_bin2d(breaks=c(-1:4)) +
  stat_bin2d(geom = "text", aes(label = ..count..), breaks=c(-1:4)) +
  scale_fill_gradient(low = "snow3", high = "red", trans = "log10") +
  xlim(-1, 5) +
  ylim(-1, 5) +
  coord_equal()
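The interval conventions of cut mentioned above can be checked directly in base R. This is only a small sketch of how cut treats its breaks, not a reproduction of what stat_bin2d does internally; the vector and the variable names (x_vals, default_bins, lowest_bins, shifted_bins) are made up for illustration.

```r
# Illustrative values only: three of the integers that occur in the data.
x_vals <- c(0, 1, 4)

# Default cut(): intervals are open on the left and closed on the right,
# so with breaks 0:4 the value 0 falls outside (0,1] and becomes NA.
default_bins <- cut(x_vals, breaks = 0:4)

# include.lowest = TRUE closes the lowest interval on the left as well,
# so 0 and 1 end up sharing the interval [0,1].
lowest_bins <- cut(x_vals, breaks = 0:4, include.lowest = TRUE)

# Starting the breaks at -1 gives every integer its own interval:
# 0 lands in (-1,0], 1 in (0,1], and 4 in (3,4].
shifted_bins <- cut(x_vals, breaks = -1:4)

print(default_bins)
print(lowest_bins)
print(shifted_bins)
```

With integer data, any set of breaks that places each integer strictly inside its own interval (or on a closed right edge, as here) avoids two values being merged into one bin.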
To center the tiles on the integer lattice points, set the breaks to half-integer values:
ggplot(data, aes(x = x, y = y)) +
  geom_bin2d(breaks=seq(-0.5,4.5,1)) +
  stat_bin2d(geom = "text", aes(label = ..count..), breaks=seq(-0.5,4.5,1)) +
  scale_fill_gradient(low = "snow3", high = "red", trans = "log10") +
  scale_x_continuous(breaks=0:4, limits=c(-0.5,4.5)) +
  scale_y_continuous(breaks=0:4, limits=c(-0.5,4.5)) +
  coord_equal()
Or, to emphasize that the values are discrete, set the bins to be half a unit wide:
ggplot(data, aes(x = x, y = y)) +
  geom_bin2d(breaks=seq(-0.25,4.25,0.5)) +
  stat_bin2d(geom = "text", aes(label = ..count..), breaks=seq(-0.25,4.25,0.5)) +
  scale_fill_gradient(low = "snow3", high = "red", trans = "log10") +
  scale_x_continuous(breaks=0:4, limits=c(-0.25,4.25)) +
  scale_y_continuous(breaks=0:4, limits=c(-0.25,4.25)) +
  coord_equal()
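Since the question is open to approaches other than stat_bin2d, one possible alternative (not part of the approach above, just a sketch) is to skip the binning entirely: because the data are already integers, the pairs can be counted with table and drawn with geom_tile, which centers each tile on its integer value. The names counts and p are illustrative, and Freq is simply the column name that as.data.frame gives to the tabulated counts.

```r
library(ggplot2)

# Same data as in the question.
x <- c(rep(0, 190), rep(1, 50), rep(2, 10), rep(3, 40))
y <- c(rep(0, 130), rep(1, 80), rep(2, 30), rep(3, 10), rep(4, 40))
data <- data.frame(x, y)

# Count each (x, y) pair. The tabulated values come back as character
# columns here, so convert them to numbers before plotting.
counts <- as.data.frame(table(x = data$x, y = data$y), stringsAsFactors = FALSE)
counts$x <- as.numeric(counts$x)
counts$y <- as.numeric(counts$y)

# Drop empty combinations so they are not drawn (and log10(0) is avoided).
counts <- counts[counts$Freq > 0, ]

# geom_tile centers each unit-wide tile on its (x, y) value.
p <- ggplot(counts, aes(x = x, y = y, fill = Freq)) +
  geom_tile() +
  geom_text(aes(label = Freq)) +
  scale_fill_gradient(low = "snow3", high = "red", trans = "log10") +
  scale_x_continuous(breaks = 0:4, limits = c(-0.5, 4.5)) +
  scale_y_continuous(breaks = 0:4, limits = c(-0.5, 4.5)) +
  coord_equal()
print(p)
```

This only makes sense when the values are discrete and already on a regular grid; for continuous data, explicit breaks as above remain the more natural choice.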
Upvotes: 3