Exclude zeros in NumPy quantile calculation of rows of an array

Question

I have a 2D-array with zero values in each row.

[[5, 3, 2, 0, 0, 1, 6, 9, 11, 1, 4, 1],
 [0, 0, 12, 0, 1, 0, 0, 2, 0, 30, 2, 2],
 [120, 2, 10, 3, 0, 0, 2, 7, 9, 5, 0, 0]]

Is there a way to calculate the 0.75 quantile of each row while excluding the zero values from the calculation?

For example, in the second row, only the 6 non-zero values [12, 1, 2, 30, 2, 2] should be used in the calculation. I tried np.quantile(), but it includes all the zero values in the calculation. NumPy also doesn't seem to have a masked-array (np.ma) version of quantile().

Chong Onn Keat · Accepted Answer

You can replace the zero values with nan and pass the array to np.nanquantile(), which computes the quantile of the non-nan values:

>>> import numpy as np
>>> # A float dtype is required: an integer array cannot hold nan
>>> arr = np.array([[5, 3, 2, 0, 0, 1, 6, 9, 11, 1, 4, 1],
...                 [0, 0, 12, 0, 1, 0, 0, 2, 0, 30, 2, 2],
...                 [120, 2, 10, 3, 0, 0, 2, 7, 9, 5, 0, 0]], dtype='f')

>>> arr[arr == 0] = np.nan  # mark zeros as missing (modifies arr in place)
>>> print(arr)
[[  5.   3.   2.  nan  nan   1.   6.   9.  11.   1.   4.   1.]
 [ nan  nan  12.  nan   1.  nan  nan   2.  nan  30.   2.   2.]
 [120.   2.  10.   3.  nan  nan   2.   7.   9.   5.  nan  nan]]

>>> arr_quantile75 = np.nanquantile(arr, 0.75, axis=1)  # axis=1: one quantile per row
>>> print(arr_quantile75)
[5.75 9.5  9.25]

According to the NumPy documentation, np.nanquantile() computes the q-th quantile of the data along the specified axis while ignoring nan values.
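If the original array needs to stay untouched (or is stored as integers), the same idea can be wrapped in a small function that works on a float copy. This is only a sketch: the name nonzero_row_quantile is made up for illustration and is not part of NumPy. The cross-check at the end filters each row with a boolean mask and calls plain np.quantile().

```python
import numpy as np

def nonzero_row_quantile(values, q):
    """Quantile of each row, ignoring zeros (hypothetical helper, not a NumPy function)."""
    # astype(float) returns a copy that can hold nan, so the caller's
    # array is left unchanged even if it has an integer dtype.
    data = np.asarray(values).astype(float)
    data[data == 0] = np.nan  # treat zeros as missing values
    return np.nanquantile(data, q, axis=1)

arr = np.array([[5, 3, 2, 0, 0, 1, 6, 9, 11, 1, 4, 1],
                [0, 0, 12, 0, 1, 0, 0, 2, 0, 30, 2, 2],
                [120, 2, 10, 3, 0, 0, 2, 7, 9, 5, 0, 0]])

result = nonzero_row_quantile(arr, 0.75)

# Cross-check: drop the zeros row by row and use the ordinary np.quantile.
expected = np.array([np.quantile(row[row != 0], 0.75) for row in arr])

print(result)
print(expected)
```

Both approaches should give the same per-row values as the answer above. Note that a row made up entirely of zeros has nothing left to compute on: the nan-based version returns nan for that row.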
