Reputation: 43
So I have some data with around a million (r, phi)
coordinates, along with their intensities. I want to sample this data in a grid pattern so I can reduce memory used, and plot faster. However I want to sample the data in X,Y as I will be converting the coordinates to (X,Y)
coordinates to plot them.
I was thinking I could use a meshgrid to come up with a template I'd like to sample, but I'm stuck on the next step.
I can't seem to find anything useful searching on google or here, but apologies if this is too simple a question!
I'm using numpy and my data is stored as three seperate arrays right now. I was planning to use np.meshgrid
and later scipy.interpolate.griddata
for interpolation.
r
, phi
and intensity
are all np.array
s with shape (million,)
e.g.
r = array([1560.8, 1560.8003119, 1560.8006238, ..., 3556.831746,
3558.815873 , 3560.8 ])
I started with this;
r = data[:, 0] # radius
phi = data[:, 1] # altitude angle
h2o = data[:, 2] # intensity
x = r * np.sin(phi) # It's a left handed coordinate system
z = r * np.cos(phi)
And for the sampling grid I have got this;
Xscale = np.linspace(min(x), max(x), 1000)
Zscale = np.linspace(min(z), max(z), 1000)
[X, Z] = np.meshgrid(Xscale, Zscale)
Upvotes: 4
Views: 2441
Reputation: 1603
It would have been nice if you have provided some data to work on. It doesn't matter, we will create some.
Lets create x,y values from r,theta arbitrary values :
import numpy as np
import matplotlib.pyplot as plt
theta=np.linspace(0.,50.,1000)
r=np.linspace(5.,10,1000)
x=r*np.sin(theta)
y=r*np.cos(theta)
plt.plot(x,y,linestyle='',marker='.')
The plot gives :
Now add arbitrary intensity values :
intensity=np.sqrt(x**2+y**2)
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(x, y, intensity)
The scatter plot gives :
If I understand well we should not be far from your starting point. We have now 3 arrays with 1000 values. We are going to reduce it to a 20x20 mesgrid. We have to first create the x and y bins, then call the binned_statistic_2d method from scipy and that's it.
import scipy.stats as stats
binx=np.linspace(-10.,10.,20)
biny=np.linspace(-10.,10.,20)
ret = stats.binned_statistic_2d(x, y, intensity, 'mean', bins=[binx,biny])
Z=ret.statistic
Z = np.ma.masked_invalid(Z) # allow to mask Nan values got in bins where there is no value
X, Y = np.meshgrid(binx,biny)
plt.pcolor(X,Y,Z)
plt.show()
The pcolor plot gives :
As requested in your comment, we can now go back to the original x,y,z arrays structure.
First, we have to calculate the center coordinates of the bins
binx_centers=(binx[1:] + binx[:-1])/2
biny_centers=(biny[1:] + biny[:-1])/2
Xcenters, Ycenters = np.meshgrid(binx_centers,biny_centers)
Then we can get the not masked values (see explanation above)
xnew=np.ma.masked_array(Xcenters, Z.mask).compressed()
ynew=np.ma.masked_array(Ycenters, Z.mask).compressed()
znew=Z.compressed()
We can check the new size :
print(znew.shape)
Gives only 235 values (instead of 1000.):
(235L,)
And the new scatter plot with the compressed values :
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(xnew, ynew, znew)
We obtain :
Upvotes: 1