Reputation: 534
I have a 2D histogram that I generate with numpy:
H, xedges, yedges = np.histogram2d(y, x, weights=mass * (1.0 - pf),
bins=(yrange,xrange))
Note that I'm currently weighing the bins with a function of mass (mass
is a numpy array with the same dimensions as x
and y
). The bins are logarithmic (generated via xrange = np.logspace(minX, maxX, 100)
).
However, I really want to weight the bins by the mass function but normalize them (i.e. divide) each by the area of the bin: e.g. - each bin has area xrange[i] * yrange[i]
. However, since xrange
and yrange
don't have the same dimensions as mass
, x
and y
... I can't simply put the normalization in the np.histogram2d
call.
How can I normalize the bin counts by the area in each log bin?
For reference, here's the plot (I've added x and y 1D histograms that I'll also need to normalize by the width of the bin, but once I figure out how to do it for 2D it should be analogous).
FYI - I generate the main (and axes-histograms) with matplotlib:
X,Y=np.meshgrid(xrange,yrange)
H = np.log10(H)
masked_array = np.ma.array(H, mask=np.isnan(H)) # mask out all nan, i.e. log10(0.0)
cax = (ax2dhist.pcolormesh(X,Y,masked_array, cmap=cmap, norm=LogNorm(vmin=1,vmax=8)))
Upvotes: 2
Views: 2659
Reputation: 74152
I think you just want to pass normed=True
to np.histogram2d
:
normed: bool, optional
If
False
, returns the number of samples in each bin. IfTrue
, returns the bin densitybin_count / sample_count / bin_area
.
If you wanted to compute the bin areas and do the normalization manually , the simplest way would probably be to use broadcasting:
x, y = np.random.rand(2, 1000)
xbin = np.logspace(-1, 0, 101)
ybin = np.logspace(-1, 0, 201)
# raw bin counts
counts, xe, ye = np.histogram2d(x, y, [xbin, ybin])
# size of each bin in x and y dimensions
dx = np.diff(xbin)
dy = np.diff(ybin)
# compute the area of each bin using broadcasting
area = dx[:, None] * dy
# normalized counts
manual_norm = counts / area / x.shape[0]
# using normed=True
counts_norm, xe, ye = np.histogram2d(x, y, [xbin, ybin], normed=True)
print(np.allclose(manual_norm, counts_norm))
# True
Upvotes: 2