Reputation: 487
I have a massive data set where I need to split my plot into a grid and count the number of points within each grid square. I'm following a method outlined here:
with a stripped-down version of my code below:
import numpy as np
import matplotlib.pyplot as plt
x = [ 1.83259571, 1.76278254, 1.38753676, 1.6406095, 1.34390352, 1.23045712, 1.85877565, 1.26536371, 0.97738438]
y = [ 0.04363323, 0.05235988, 0.09599311, 0.10471976, 0.1134464, 0.13962634, 0.17453293, 0.20943951, 0.23561945]
gridx = np.linspace(min(x),max(x),11)
gridy = np.linspace(min(y),max(y),11)
grid, _, _ = np.histogram2d(x, y, bins=[gridx, gridy])
plt.figure()
plt.plot(x, y, 'ro')
plt.grid(True)
plt.figure()
plt.pcolormesh(gridx, gridy, grid)
plt.plot(x, y, 'ro')
plt.colorbar()
plt.show()
The problem is that the grid colors some cells as if they contained points when no points fall inside them; conversely, some cells that do contain data points are colored as if they were empty.
What might be causing this problem? Also, sorry for not attaching the plot, I'm a new user and my reputation isn't high enough.
UPDATE: Here's code that generates 100 random points and attempts to plot them in a 2-D histogram:
import numpy as np
import matplotlib.pyplot as plt
x = np.random.rand(100)
y = np.random.rand(100)
gridx = np.linspace(0,1,11)
gridy = np.linspace(0,1,11)
grid, __, __ = np.histogram2d(x, y, bins=[gridx, gridy])
plt.figure()
plt.plot(x, y, 'ro')
plt.grid(True)
plt.figure()
plt.pcolormesh(gridx, gridy, grid)
plt.plot(x, y, 'ro')
plt.colorbar()
plt.show()
Yet when I run it I have the same problem as before: the locations of the points and the colors corresponding to point-location-density don't agree. Does this happen when anyone runs this code for themselves?
SECOND UPDATE
And at the risk of beating a dead horse, here's code for a parametric plot:
import numpy as np
import matplotlib.pyplot as plt
t = np.linspace(0,1,100)
x = np.sin(t)
y = np.cos(t)
gridx = np.linspace(0,1,11)
gridy = np.linspace(0,1,11)
#grid, __, __ = np.histogram2d(x, y, bins=[gridx, gridy])
grid, __, __ = np.histogram2d(x, y)
plt.figure()
plt.plot(x, y, 'ro')
plt.grid(True)
plt.figure()
plt.pcolormesh(gridx, gridy, grid)
plt.plot(x, y, 'ro')
plt.colorbar()
plt.show()
which makes me think this is all some kind of weird scaling issue. Still totally lost though...
Upvotes: 5
Views: 8233
Reputation: 1
Referencing numpy histogram2d's documentation...
The careful reader will note that the parameters are backwards.
histogram2d(y, x, bins=(xedges, yedges))
Compute the bi-dimensional histogram of two data samples.
Parameters
x : array_like, shape (N,) An array containing the x coordinates of the points to be histogrammed.
y : array_like, shape (N,) An array containing the y coordinates of the points to be histogrammed.
Ergo, you supplied your x's to the y parameter of the function and vice versa for the y's.
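A quick way to see which axis ends up where is to histogram a single point and look at which cell gets the count. The sketch below is only an illustration: the test point and the two-bin edges are made up, not taken from the question. It shows that the array returned by histogram2d is indexed as [x bin, y bin], so the x values run along the first dimension, and that the transpose gives the [row = y, column = x] layout that pcolormesh reads.

```python
import numpy as np

# Made-up test point: far right in x (0.9), near the bottom in y (0.1).
x = [0.9]
y = [0.1]
edges = [0.0, 0.5, 1.0]  # two bins per axis, chosen only for illustration

H, xedges, yedges = np.histogram2d(x, y, bins=[edges, edges])

# The count lands in H[1, 0]: x bin 1 (0.5 to 1.0), y bin 0 (0.0 to 0.5).
# So the FIRST index of H follows x and the SECOND follows y.
x_bin, y_bin = np.argwhere(H == 1)[0]

# pcolormesh(X, Y, C) treats C as C[row = y, column = x],
# so the array to hand over is the transpose.
C = H.T
```

Either transposing the result or swapping the two inputs gives the same array here; the transpose keeps the call to histogram2d in the documented (x, y) order.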
Regards
Upvotes: 0
Reputation: 4164
I was able to get your example to work by using imshow with interpolation instead of pcolormesh. See sample code below.
I think the problem may be that pcolormesh has a different origin convention than plot. The results of pcolormesh look like the upper left and lower right are flipped.
The result with imshow looks like:
The sample code:
import numpy as np
import matplotlib.pyplot as plt
def doPlot():
    x = [ 1.83259571, 1.76278254, 1.38753676, 1.6406095, 1.34390352, 1.23045712, 1.85877565, 1.26536371, 0.97738438]
    y = [ 0.04363323, 0.05235988, 0.09599311, 0.10471976, 0.1134464, 0.13962634, 0.17453293, 0.20943951, 0.23561945]
    gridx = np.linspace(min(x),max(x),11)
    gridy = np.linspace(min(y),max(y),11)
    H, xedges, yedges = np.histogram2d(x, y, bins=[gridx, gridy])
    plt.figure()
    plt.plot(x, y, 'ro')
    plt.grid(True)
    #wrong origin convention for pcolormesh?
    #plt.figure()
    #plt.pcolormesh(gridx, gridy, H)
    #plt.plot(x, y, 'ro')
    #plt.colorbar()
    plt.figure()
    myextent =[xedges[0],xedges[-1],yedges[0],yedges[-1]]
    # H is indexed [x, y], so transpose it; origin='lower' puts row 0 at the bottom
    plt.imshow(H.T,origin='lower',extent=myextent,interpolation='nearest',aspect='auto')
    plt.plot(x,y,'ro')
    plt.colorbar()
    plt.show()

if __name__=="__main__":
    doPlot()
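For what it's worth, the same transpose appears to be enough to make pcolormesh line up as well, without switching to imshow. The following is a minimal sketch, not a tested fix for every case: plot_counts is a hypothetical helper name introduced here, and nbins=10 just mirrors the 11 edges used in the question. Because the grid has the same number of bins on both axes, passing H without the transpose draws without any error, but the picture comes out mirrored about the diagonal.

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_counts(x, y, nbins=10):
    """Hypothetical helper: count points per grid cell and draw the counts."""
    gridx = np.linspace(min(x), max(x), nbins + 1)
    gridy = np.linspace(min(y), max(y), nbins + 1)
    H, xedges, yedges = np.histogram2d(x, y, bins=[gridx, gridy])

    fig, ax = plt.subplots()
    # H is indexed [x bin, y bin]; pcolormesh reads [y bin, x bin], hence H.T.
    mesh = ax.pcolormesh(xedges, yedges, H.T)
    ax.plot(x, y, 'ro')  # overlay the raw points to check the alignment
    fig.colorbar(mesh, ax=ax)
    return H, fig


# Data copied from the question.
x = [1.83259571, 1.76278254, 1.38753676, 1.6406095, 1.34390352,
     1.23045712, 1.85877565, 1.26536371, 0.97738438]
y = [0.04363323, 0.05235988, 0.09599311, 0.10471976, 0.1134464,
     0.13962634, 0.17453293, 0.20943951, 0.23561945]

H, fig = plot_counts(x, y)
```

Add plt.show() after the last line to open the window; each red point should then sit inside a colored cell.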
Upvotes: 2