Reputation: 1695
I need to perform an interpolation of some Nan
values in a 2d numpy array, see for example the following picture:
In my current approach I use scipy.interpolate.griddata
for the interpolation procedure. However I noticed that when
mirroring the array on both axis i.e. d2 = d[::-1, ::-1]
the interpolation gives different results.
Here is a complete example :
import numpy as np
import matplotlib.pyplot as plt
import scipy.interpolate as interp
def replace_outliers(f):
mask = np.isnan(f)
lx, ly = f.shape
x, y = np.mgrid[0:lx, 0:ly]
z = interp.griddata(np.array([x[~mask].ravel(),y[~mask].ravel()]).T,
f[~mask].ravel(),
(x,y), method='linear', fill_value=0)
return z
def main():
d = np.load('test.npy')
d2 = d[::-1, ::-1]
dn = replace_outliers(d)
dn2 = replace_outliers(d2)
print np.sum(dn - dn2[::-1, ::-1])
plt.imshow(dn-dn2[::-1, ::-1], interpolation='nearest')
plt.colorbar()
plt.show()
if __name__=='__main__':
main()
This gives the difference between the two interpolations:
or as evaluated by np.sum
its about -62.7
So how can it be that a simple mirroring of the array gives different results in the interpolation procedure? Is there maybe something wrong with the coordinates I use ?
Upvotes: 1
Views: 1892
Reputation: 35145
The reason likely is that the linear interpolation is triangle-based. However, such a square grid is a degenerate case for Delaunay triangulation, and the triangulation is not unique. I can imagine the outcome depends on the order of the data points.
For a missing data point, I would guess the two cases correspond to different triangulations of the empty space:
A A
* * * *---*---* *---*---*
| / \ | | / | \ |
* * => D*-------*B or D* | *B
| \ / | | \ | / |
* * * *---*---* *---*---*
C C
If you now compute the value at the center, you get (B+D)/2 from one triangulation and (A+C)/2 from the other.
Upvotes: 4