betelgeuse

Reputation: 459

Ignore NaN in numpy bincount in python

I have a 1D array and I want to use numpy bincount to create a histogram. It works OK, but I want it to ignore NaN values.

histogram = np.bincount(distancesArray, weights=intensitiesArray) / np.bincount(distancesArray)

How can I do that?

Thanks for your help!

Upvotes: 5

Views: 4469

Answers (3)

ZJS

Reputation: 4051

You could convert it to a pandas Series and remove the null values.

import pandas as pd

ds = pd.Series(distancesArray)
ds = ds[ds.notnull()]   # keep only the non-null values

Upvotes: 0

Davidmh

Reputation: 3865

You cannot have NaN in an integer-valued array. If you pass a float array to np.bincount, it raises:

TypeError: Cannot cast array data from dtype('float64') to dtype('int64') according to the rule 'safe'

If you cast anyway (.astype(int)), each NaN turns into a meaningless value such as -9223372036854775808. You can avoid this by selecting only the non-NaN entries:

mask = ~(np.isnan(distancesArray) | np.isnan(intensitiesArray))
histogram = np.bincount(distancesArray[mask].astype(int),
                        weights=intensitiesArray[mask])
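
The masking above can be combined with the division from the question to get a per-bin mean. The following is only a sketch: the function name binned_mean and the sample data are made up for illustration, and it assumes the distances are non-negative and meant to be truncated to integers. Bins that end up with no valid samples are reported as NaN.

```python
import numpy as np

def binned_mean(distances, intensities):
    """Mean intensity per integer distance bin, skipping NaN entries."""
    distances = np.asarray(distances, dtype=float)
    intensities = np.asarray(intensities, dtype=float)
    # Keep only positions where both values are valid numbers.
    mask = ~(np.isnan(distances) | np.isnan(intensities))
    bins = distances[mask].astype(int)
    sums = np.bincount(bins, weights=intensities[mask])
    counts = np.bincount(bins)
    # Bins without any valid sample stay NaN instead of dividing by zero.
    return np.divide(sums, counts,
                     out=np.full(sums.shape, np.nan),
                     where=counts > 0)

# Hypothetical sample data: one NaN intensity and one NaN distance.
distances = [0, 1, 1, 2, np.nan, 4]
intensities = [0.5, np.nan, 2.0, 3.0, 9.0, 1.0]
result = binned_mean(distances, intensities)
```

Because the same mask is applied to both arrays, the sums and the counts come from the same set of samples, so a NaN intensity does not inflate the count for its bin.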

Upvotes: 2

Veedrac

Reputation: 60137

Here's what I think your problem is:

import numpy

w = numpy.array([0.3, float("nan"), 0.2, 0.7, 1., -0.6])  # weights
x = numpy.array([0, 1, 1, 2, 2, 2])
numpy.bincount(x, weights=w)
#>>> array([ 0.3,  nan,  1.1])

The solution is to use boolean indexing to keep only the entries whose weight is not NaN:

keep = ~numpy.isnan(w)
numpy.bincount(x[keep], weights=w[keep])
#>>> array([ 0.3,  0.2,  1.1])
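
One thing to watch with this filtering: if every entry of the highest bin is NaN, the filtered result comes out shorter than the unfiltered one. A possible workaround, sketched here with made-up data, is the minlength argument of bincount, assuming you want the bin count from before filtering.

```python
import numpy as np

x = np.array([0, 1, 1, 2, 2, 3])
w = np.array([0.3, np.nan, 0.2, 0.7, 1.0, np.nan])  # the only bin-3 weight is NaN

keep = ~np.isnan(w)
n_bins = x.max() + 1  # number of bins before filtering

# Without minlength the result stops at the largest surviving index.
short = np.bincount(x[keep], weights=w[keep])
# With minlength, bins emptied by the filtering are kept as zeros.
full = np.bincount(x[keep], weights=w[keep], minlength=n_bins)
```

Here short has three entries while full has four, with the last one equal to zero.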

Upvotes: 7
