Reputation:
I have some data (x,y,z) lying on an unstructured grid and I would like to interpolate the data for visualization purposes.
I have already tried scipy.interpolate.griddata
, the interpolation assumes everywhere the same value. After that I tried scipy.interpolate.Rbf
, but this gets me a memory error (see code below).
Is there another method or other options which improve the result?
Result-->
My code
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import griddata, Rbf
x, y, z = np.loadtxt('stackoverflow_example-data')
# griddata
points = np.reshape(np.array([x, y]),(z.size, 2))
grid_x, grid_y = np.mgrid[x.min():x.max():1000j,y.min():y.max():1000j]
counts_I_grid_1 = griddata(points, z, (grid_x, grid_y), method='nearest')
counts_I_grid_2 = griddata(points, z, (grid_x, grid_y), method='linear', fill_value=0)
counts_I_grid_3 = griddata(points, z, (grid_x, grid_y), method='cubic', fill_value=0)
# Rbf -- fails due to memory error
#rbf = Rbf(x,y,z)
#counts_I_Rbf = rbf(grid_x,grid_y)
Traceback (most recent call last): File "/path/code.py", line 14, in rbf = Rbf(x,y,z) File "/[...]/python3.4/site-packages/scipy/interpolate/rbf.py", line 198, in init r = self._call_norm(self.xi, self.xi) File "/[...]/python3.4/site-packages/scipy/interpolate/rbf.py", line 222, in _call_norm return self.norm(x1, x2) File "/[...]/python3.4/site-packages/scipy/interpolate/rbf.py", line 114, in _euclidean_norm return sqrt(((x1 - x2)**2).sum(axis=0)) MemoryError
# plot the result
fig = plt.figure()
ax1 = plt.subplot(2,2,1)
plt.title('Data')
plt.gca().set_aspect((x.max() - x.min()) / (y.max() - y.min()))
plt.scatter(x, y, c=z, s=2, edgecolor='', marker=',')
plt.colorbar(ax=ax1)
plt.xlim(x.min(), x.max())
plt.ylim(y.min(), y.max())
plt.xticks([20.7,20.9,21.1,21.3])
plt.ticklabel_format(useOffset=False)
ax2 = plt.subplot(2,2,2)
plt.title('nearest')
plt.imshow(counts_I_grid_1.T, origin='lower',
extent=(x.min(), x.max(), y.min(), y.max()),
aspect=(x.max() - x.min()) / (y.max() - y.min()),
vmin=0,vmax=36)
plt.colorbar(ax=ax2)
plt.xticks([20.7,20.9,21.1,21.3])
plt.ticklabel_format(useOffset=False)
ax2 = plt.subplot(2,2,3)
plt.title('linear')
plt.imshow(counts_I_grid_2.T, origin='lower',
extent=(x.min(), x.max(), y.min(), y.max()),
aspect=(x.max() - x.min()) / (y.max() - y.min()),
vmin=0,vmax=36)
plt.colorbar(ax=ax2)
plt.xticks([20.7,20.9,21.1,21.3])
plt.ticklabel_format(useOffset=False)
ax2 = plt.subplot(2,2,4)
plt.title('cubic')
plt.imshow(counts_I_grid_3.T, origin='lower',
extent=(x.min(), x.max(), y.min(), y.max()),
aspect=(x.max() - x.min()) / (y.max() - y.min()),
vmin=0,vmax=36)
plt.colorbar(ax=ax2)
plt.xticks([20.7,20.9,21.1,21.3])
plt.ticklabel_format(useOffset=False)
plt.tight_layout()
plt.show()
Upvotes: 1
Views: 4630
Reputation: 35080
Your problem is due to a subtle mistake that is dangerous enough that I believe it's worth giving a full answer.
Consider this line:
points = np.reshape(np.array([x, y]),(z.size, 2))
Since your input arrays are 1d, you're trying to transform [x,y]
into shape "(something, 2)". Note that it's valid to say
points = np.reshape(np.array([x, y]),(-1, 2))
to let numpy infer the missing dimension for you, and this would still not be what you want. When you construct the 2d array
np.array([x, y])
you're defining a matrix row by row, thereby creating an array of shape "(2, something)". When you call reshape
on it, it will read the elements row-by-row by default, since numpy stores arrays in row-major order (like C/C++, and unlike fortran and MATLAB). This means that the resulting two-column array will first contain all the x
values, then all the y
values, rather than containing pairs of (x,y)
in each row.
What you're actually trying to do is to swap the dimensions of your array, without touching its structure. This means that you have to transpose your matrix. This means that you have to use
points = np.array([x, y]).T
# or np.transpose([x,y])
instead. Note that while this has the same shape
as your original, it has the elements in the proper order:
In [320]: np.reshape(np.array([x,y]),(-1,2))
Out[320]:
array([[ 20.7 , 20.702 ],
[ 20.704 , 20.706 ],
[ 20.708 , 20.71 ],
...,
[ 852.356964, 852.356964],
[ 852.356964, 852.356964],
[ 852.356964, 852.356964]])
In [321]: np.array([x,y]).T
Out[321]:
array([[ 20.7 , 852.357235],
[ 20.702 , 852.357235],
[ 20.704 , 852.357235],
...,
[ 21.296 , 852.356964],
[ 21.298 , 852.356964],
[ 21.3 , 852.356964]])
This will fix the inconsistency between your sample x/y
points and z
, and will produce the expected results. In my experience, reshape
-ing is almost never called for. Often you need to flatten out an ndarray
into a 1d
one, but then the ravel()
method of the given array is best.
Results as proof: left, your original with cubic interpolation; right, the version with points
fixed:
Note that as @MaxNoe suggested, I reduced your interpolating mesh size to 200x200. And as they hinted, your memory error with Rbf
most probably stemmed from this huge number of interpolating points.
Upvotes: 3