Reputation: 3263
I need to fill in missing values (given as 0) in a 2D matrix. How would I accomplish this in NumPy/SciPy? I found the scipy.interpolate.interp2d function, but I cannot quite understand how to make it fill in only the zeros without modifying the non-zero entries.
Here is an example of this function being used to smooth an image: https://scipython.com/book/chapter-8-scipy/examples/scipyinterpolateinterp2d/
but this is not what I am looking for. I just want to fill in the zero values.
For example, the matrix is
import numpy as np
mat = np.array([[1,2,0,0,4], [1,0,0,0,8], [0,4,2,2,0], [0,0,0,0,8], [1,0,0,0,1]])
mat
array([[1, 2, 0, 0, 4],
       [1, 0, 0, 0, 8],
       [0, 4, 2, 2, 0],
       [0, 0, 0, 0, 8],
       [1, 0, 0, 0, 1]])
In this matrix, all zeros must be replaced with interpolated values while original values should remain the same. What can I use for this task?
Upvotes: 2
Views: 2686
Reputation: 1097
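You have to decide how you want to fill in the zeros. For example, you could just use the average value of the array (note that np.average(mat) counts the zeros as well, and the result, 1.36 here, is truncated to 1 when assigned into an integer array):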
mat[mat == 0] = np.average(mat)
mat
# array([[1, 2, 1, 1, 4],
#        [1, 1, 1, 1, 8],
#        [1, 4, 2, 2, 1],
#        [1, 1, 1, 1, 8],
#        [1, 1, 1, 1, 1]])
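Or you could use values from a function fitted to the nonzero entries. scipy.interpolate.interp2d fits a "spline" (think piecewise polynomial) through the known points. Note that interp2d was deprecated in SciPy 1.10 and removed in SciPy 1.14, so the following only runs on older versions: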
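from scipy.interpolate import interp2d
# (row, col) indices of the known (nonzero) entries
ix = np.where(mat != 0)
# Rows are passed as x and columns as y, hence the transpose below
f = interp2d(ix[0], ix[1], mat[ix].flatten(), kind='linear')
mat2 = mat.copy()
# Evaluate on the full 5x5 grid and overwrite only the zeros
mat2[mat==0] = f(range(5), range(5)).T[mat==0]
mat2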
# array([[ 1,  2,  3,  4,  4],
#        [ 1,  1,  1,  1,  8],
#        [ 4,  4,  2,  2, 11],
#        [ 4,  3,  2,  1,  8],
#        [ 1,  0,  0,  0,  1]])
That said, I think you will find this approach pretty finicky, especially for such a small dataset (note that the last row of the output still contains zeros).
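On SciPy versions where interp2d is not available, one possible alternative for scattered known points is scipy.interpolate.griddata. The following is only a sketch under a few assumptions of mine: the matrix is converted to float, the names known, points, targets and filled are just illustrative, and a nearest-neighbour fallback is used for any cell where linear interpolation returns NaN (cells outside the convex hull of the known points).

```python
import numpy as np
from scipy.interpolate import griddata

mat = np.array([[1, 2, 0, 0, 4],
                [1, 0, 0, 0, 8],
                [0, 4, 2, 2, 0],
                [0, 0, 0, 0, 8],
                [1, 0, 0, 0, 1]], dtype=float)

known = mat != 0                    # True where a real value exists
points = np.argwhere(known)         # (row, col) coordinates of known cells
values = mat[known]                 # known values, same row-major order
targets = np.argwhere(~known)       # (row, col) coordinates of cells to fill

# Linear interpolation over a triangulation of the known points
linear = griddata(points, values, targets, method='linear')
# Nearest-neighbour values, used only where linear interpolation gives NaN
nearest = griddata(points, values, targets, method='nearest')

filled = mat.copy()
filled[~known] = np.where(np.isnan(linear), nearest, linear)
print(filled)
```

Only the cells that were zero are overwritten, so the original entries stay exactly as they were. Whether a linear fit is sensible still depends on what the data represents.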
You could also look at other imputation approaches, such as nearest neighbors.
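As a rough illustration of the nearest-neighbour idea, here is a minimal sketch that copies into each zero cell the value of the closest nonzero cell, using scipy.ndimage.distance_transform_edt with return_indices=True. The names missing, idx and filled_nn are mine, and how ties between equally distant neighbours are resolved is up to the SciPy implementation.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

mat = np.array([[1, 2, 0, 0, 4],
                [1, 0, 0, 0, 8],
                [0, 4, 2, 2, 0],
                [0, 0, 0, 0, 8],
                [1, 0, 0, 0, 1]])

missing = mat == 0                  # True where a value has to be filled in

# For every cell, get the index of the nearest cell where `missing` is False.
# idx has shape (2, rows, cols): idx[0] holds row indices, idx[1] column indices.
idx = distance_transform_edt(missing,
                             return_distances=False,
                             return_indices=True)

# Known cells map to themselves, so their values are left unchanged
filled_nn = mat[tuple(idx)]
print(filled_nn)
```

This keeps the integer dtype and never invents values that are not already in the matrix, at the cost of producing blocky, piecewise-constant fills.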
Upvotes: 2