Reputation: 2955
Here's a brief example of a function. It maps a vector to a vector. However, entries that are NaN or inf should be ignored. Currently this looks rather clumsy to me. Do you have any suggestions?
from scipy import stats
import numpy as np

def p(vv):
    # operate only on the finite entries; NaN and inf positions stay NaN in the result
    mask = np.isfinite(vv)
    y = np.nan * vv
    v = vv[mask]
    y[mask] = 1/v*(stats.hmean(v)/len(v))
    return y
Upvotes: 0
Views: 1737
Reputation: 885
Masked arrays provide this functionality and let you specify the mask however you like. The NumPy 1.18 docs for them are here: https://numpy.org/doc/1.18/reference/maskedarray.generic.html#what-is-a-masked-array
In a masked array, entries whose mask value is False are used in calculations, while entries whose mask value is True are ignored.
Example of obtaining the mean of only the finite values, using np.isfinite() to build the mask:
import numpy as np
# Seeding for reproducing these results
np.random.seed(0)
# Generate random data and add some non-finite values
x = np.random.randint(0, 5, (3, 3)).astype(np.float32)
x[1,2], x[2,1], x[2,2] = np.inf, -np.inf, np.nan
# array([[  4.,   0.,   3.],
#        [  3.,   3.,  inf],
#        [  3., -inf,  nan]], dtype=float32)
# Make masked array. Note the logical not of isfinite
x_masked = np.ma.masked_array(x, mask=~np.isfinite(x))
# Mean of entire masked matrix
x_masked.mean()
# 2.6666666666666665
# Masked matrix's row means
x_masked.mean(1)
# masked_array(data=[2.3333333333333335, 3.0, 3.0],
#              mask=[False, False, False],
#        fill_value=1e+20)
# Masked matrix's column means
x_masked.mean(0)
# masked_array(data=[3.3333333333333335, 1.5, 3.0],
#              mask=[False, False, False],
#        fill_value=1e+20)
Note that scipy.stats.hmean() also works with masked arrays.
Note that if all you care about is detecting NaNs while leaving infs alone, you can use np.isnan() instead of np.isfinite().
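As a rough sketch of how the masked-array approach might be applied to the function in the question: the helper name p_masked is hypothetical, np.ma.masked_invalid() is used as a shortcut for masking everything that is NaN or inf, and the harmonic mean is computed by hand (count divided by the sum of reciprocals) so that the example needs only NumPy. It assumes at least one finite, non-zero entry.

```python
import numpy as np

def p_masked(vv):
    """Hypothetical masked-array variant of the question's p()."""
    # masked_invalid masks every NaN and +/-inf entry
    v = np.ma.masked_invalid(np.asarray(vv, dtype=float))
    n = v.count()                  # number of unmasked (finite) entries
    hmean = n / (1.0 / v).sum()    # harmonic mean over the finite entries only
    y = (1.0 / v) * (hmean / n)    # same formula as in the question
    return y.filled(np.nan)        # masked positions come back as NaN
```

For an input such as [1, 2, nan, 4, inf], the finite positions should come out as 4/7, 2/7 and 1/7, and the other two as NaN.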
Upvotes: 1
Reputation: 2955
I came up with this kind of construction:
from scipy import stats
import numpy as np

## operate only on the valid entries of x and use the same mask on the resulting vector y
def __f(func, x):
    mask = np.isfinite(x)
    y = np.nan * x
    y[mask] = func(x[mask])
    return y

# implementation of the parity function
def __pp(x):
    return 1/x*(stats.hmean(x)/len(x))

def pp(vv):
    return __f(__pp, vv)
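The same wrapper idea could also be written as a decorator, which might read a little more compactly. This is only a sketch: the names finite_only and inverse_share are made up for illustration, and the formula is rewritten without SciPy using the fact that hmean(x)/len(x) equals 1/sum(1/x).

```python
import functools
import numpy as np

def finite_only(func):
    """Hypothetical decorator: apply func to the finite entries only."""
    @functools.wraps(func)
    def wrapper(x):
        x = np.asarray(x, dtype=float)
        mask = np.isfinite(x)
        y = np.full(x.shape, np.nan)   # non-finite positions stay NaN
        y[mask] = func(x[mask])        # func sees only the finite values
        return y
    return wrapper

@finite_only
def inverse_share(x):
    # algebraically the same as 1/x * (hmean(x) / len(x))
    return (1.0 / x) / np.sum(1.0 / x)
```

Calling inverse_share(np.array([1.0, 2.0, np.nan, 4.0, np.inf])) would then apply the formula to 1, 2 and 4 only and leave NaN in the remaining two positions.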
Upvotes: 1
Reputation: 569
You can replace the NaN values with zero using NumPy's isnan function and then remove the zeros as follows:
import numpy as np

def p(vv):
    # assuming vv is your array (note: it is modified in place)
    # use NumPy's isnan function to replace the NaN values in the array with zero
    replace_NaN = np.isnan(vv)
    vv[replace_NaN] = 0
    # convert array vv to list
    vv_list = vv.tolist()
    new_list = []
    # loop over vv_list and exclude 0 values (this also drops any genuine zeros):
    for i in vv_list:
        if i != 0:
            new_list.append(i)
    # set array vv again
    vv = np.array(new_list, dtype = 'float64')
    return vv
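A shorter variant of the same filtering can be sketched with boolean indexing. The name drop_nan is hypothetical. Unlike the loop above, this version keeps genuine zeros and leaves the input array untouched; like the loop, it removes only NaN, so inf entries pass through.

```python
import numpy as np

def drop_nan(vv):
    """Hypothetical helper: return a copy of vv without its NaN entries."""
    vv = np.asarray(vv, dtype=float)
    # boolean indexing returns a new array, so the caller's data is not modified
    return vv[~np.isnan(vv)]
```

Swapping ~np.isnan(vv) for np.isfinite(vv) would drop the inf entries as well.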
Upvotes: 1