Reputation: 1423
The results are correct, but in my real problem the data are too large, so I want to apply the interpolation directly, without using a for loop. Any ideas would be appreciated.
import numpy as np
from scipy.interpolate import interp1d

data = np.array([[99, 0, 3, 4, 5],
                 [6, 7, 0, 9, 10],
                 [11, 22, 0, 14, 15]], dtype=np.float32)
data[data == 0] = np.nan

def gap_fill(y):
    # linearly interpolate over the NaN gaps using the non-NaN points
    not_nan = ~np.isnan(y)
    x = np.arange(len(y))
    interp = interp1d(x[not_nan], y[not_nan], kind='linear')
    ynew = interp(x)
    return ynew

results = []
for d in data:
    gapfilled = gap_fill(d)
    results.append(gapfilled)
print(results)
[array([ 99., 51., 3., 4., 5.]), array([ 6., 7., 8., 9., 10.]), array([ 11., 22., 18., 14., 15.])]
Upvotes: 2
Views: 1365
Reputation: 231738
What I was thinking of, on the spur of the moment, was:
In [8]: gap_fill(data.flatten()).reshape(data.shape)
Out[8]:
array([[ 99., 51., 3., 4., 5.],
[ 6., 7., 8., 9., 10.],
[ 11., 22., 18., 14., 15.]])
That works for your example because all the nan are internal to the rows. However, for gaps at the ends of the rows, flattening turns extrapolation into interpolation across row boundaries, which you probably don't want.
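To see the cross-row problem, here's a minimal sketch using your gap_fill on a made-up 2x3 array (my example data, not yours) with a NaN at the end of the first row:

```python
import numpy as np
from scipy.interpolate import interp1d

def gap_fill(y):
    not_nan = ~np.isnan(y)
    x = np.arange(len(y))
    return interp1d(x[not_nan], y[not_nan], kind='linear')(x)

# NaN at the END of the first row
data = np.array([[1., 2., np.nan],
                 [10., 11., 12.]])

filled = gap_fill(data.flatten()).reshape(data.shape)
# The flattened x is [0,1,2,3,4,5]; the NaN at flat index 2 is
# interpolated between y[1]=2 and y[3]=10 (the start of the NEXT row):
# 2 + (10 - 2) * (2 - 1) / (3 - 1) = 6.0
# Row-wise extrapolation would have given 3.0 instead.
```

So the flatten trick silently pulls in the neighboring row's value for edge gaps.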
Strictly speaking, linear interpolation finds a value BETWEEN two points, (1-a)*x1 + a*x2 where 0 <= a <= 1. If a is outside that range, that's linear extrapolation.
The default action in interp1d is to raise an error in extrapolation cases. Since your iterative gap_fill runs without error, you must not have any extrapolation cases, in which case my flatten solution should work fine.
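To illustrate that default, here's a small sketch. Evaluating outside the x range raises ValueError; if you do want extrapolation, fill_value='extrapolate' (available in SciPy 0.17+, if I recall the version correctly) switches it on:

```python
import numpy as np
from scipy.interpolate import interp1d

f = interp1d([0., 1.], [0., 2.], kind='linear')
inside = f(0.5)          # 1.0 -- interpolation, within [0, 1]

try:
    f(2.0)               # outside [0, 1]: ValueError by default
    raised = False
except ValueError:
    raised = True

# opt in to linear extrapolation instead
f2 = interp1d([0., 1.], [0., 2.], kind='linear',
              fill_value='extrapolate')
outside = f2(2.0)        # 4.0 -- same slope, extended past x=1
```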
It doesn't look like interp1d uses any compiled C code for linear interpolation. Also, looking at its documentation, you might gain some speed by adding copy=False, assume_sorted=True.
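For example (a sketch with made-up data): copy=False avoids internal copies of x and y, and assume_sorted=True skips the sort, which is safe here because np.arange is already increasing:

```python
import numpy as np
from scipy.interpolate import interp1d

y = np.array([1., np.nan, 3., 4.])
x = np.arange(len(y))
m = ~np.isnan(y)

# copy=False: don't copy the input arrays internally
# assume_sorted=True: x[m] is already increasing, skip the argsort
f = interp1d(x[m], y[m], kind='linear',
             copy=False, assume_sorted=True)
ynew = f(x)   # the NaN at index 1 becomes 2.0
```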
Its core action is:
slope = (y_hi - y_lo) / (x_hi - x_lo)[:, None]
y_new = slope*(x_new - x_lo)[:, None] + y_lo
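You can apply those two lines directly to your own arrays. A sketch with made-up bracketing points: x_lo/x_hi bracket each query point x_new, and the [:, None] broadcasts the per-point slope across multiple y columns at once:

```python
import numpy as np

# two query points, each bracketed by (x_lo, x_hi); y has two columns
x_lo = np.array([0., 1.])
x_hi = np.array([1., 3.])
y_lo = np.array([[0., 10.],
                 [2., 12.]])
y_hi = np.array([[2., 12.],
                 [6., 16.]])
x_new = np.array([0.5, 2.0])

# the core of interp1d's linear case, vectorized over points and columns
slope = (y_hi - y_lo) / (x_hi - x_lo)[:, None]
y_new = slope * (x_new - x_lo)[:, None] + y_lo
# row 0: halfway between (0,10) and (2,12) -> (1, 11)
# row 1: halfway between (2,12) and (6,16) -> (4, 14)
```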
Upvotes: 2