Yun Tae Hwang

Reputation: 1471

How to return the number of certain values in an array using NumPy

I need to return the number of unreasonable values (NaN or out of range) in the 3rd column, which has 0s and a blank in it. In the real problem I have to deal with a CSV file, but for now I just created an ndarray.

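import numpy as np

# the blank cell is written as np.nan so that the array can actually be created
data = np.array([[1, 2000, 143, 4546], [2, 1999, 246, 0], [3, 2008, 190, np.nan], [4, 2000, 100, 0]])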

I can't even think of where I should start.

It would be awesome if someone could help.

Upvotes: 0

Views: 57

Answers (1)

RagingRoosevelt

Reputation: 2164

First, you need to be able to access just the column that you're interested in. Do this with a slice:

data[:,2] # grab all rows, and just the column with index 2
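As a quick check of what the slice returns, here is a minimal sketch. It assumes the blank cell from the question is stored as np.nan, which makes the whole array floating point; the name third is only illustrative.

```python
import numpy as np

# Data from the question; the blank cell is assumed to be np.nan.
data = np.array([[1, 2000, 143, 4546],
                 [2, 1999, 246, 0],
                 [3, 2008, 190, np.nan],
                 [4, 2000, 100, 0]])

third = data[:, 2]    # every row, column with index 2
print(third)          # [143. 246. 190. 100.]
print(third.shape)    # (4,)
```

The result is a one-dimensional view of the column, so the counting functions below operate on just those four values.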

Now you want to count the occurrences that are NaN:

np.count_nonzero(np.isnan(data[:,2]))

And we want to count the number of zero elements:

data[:,2].size - np.count_nonzero(data[:,2])

And if we add those together:

data[:,2].size - np.count_nonzero(data[:,2]) + np.count_nonzero(np.isnan(data[:,2]))

This is boring, though, since the 3rd column doesn't have any 0 or NaN in it. Let's try with the last column:

>>> last_col = data[:,3]  # avoid the name "slice", which shadows a Python built-in
>>> last_col.size - np.count_nonzero(last_col) + np.count_nonzero(np.isnan(last_col))
3
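Putting the pieces together, the whole calculation could be wrapped up roughly like this. It is only a sketch: count_zero_or_nan is a hypothetical helper name, and the blank cell is again assumed to be np.nan.

```python
import numpy as np

def count_zero_or_nan(column):
    """Count the entries of a 1-D float array that are 0 or NaN."""
    # NaN counts as non-zero, so zeros and NaNs are never counted twice.
    n_zero = column.size - np.count_nonzero(column)
    n_nan = np.count_nonzero(np.isnan(column))
    return n_zero + n_nan

# Data from the question; the blank cell is assumed to be np.nan.
data = np.array([[1, 2000, 143, 4546],
                 [2, 1999, 246, 0],
                 [3, 2008, 190, np.nan],
                 [4, 2000, 100, 0]])

print(count_zero_or_nan(data[:, 2]))  # 0
print(count_zero_or_nan(data[:, 3]))  # 3
```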

Edit: I should explain why this works:

np.isnan(data[:,2]) gives an array of True and False values depending on whether each element is NaN. True, when treated as a number, is converted to 1 and False is converted to 0, so the np.count_nonzero call counts the number of 1s, which represent the NaN values.

np.count_nonzero(data[:,2]) counts the non-zero elements directly. NaN counts as non-zero here, so if we subtract the number of non-zero elements from the total number of elements, we get the number of 0s.
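The same idea can be written with explicit boolean masks, which also extends to the "out of range" part of the question. This is a sketch under assumptions: LOW and HIGH are made-up bounds, since the question never says what range is valid, and the blank cell is taken to be np.nan.

```python
import numpy as np

# Last column from the question; the blank cell is assumed to be np.nan.
col = np.array([4546, 0, np.nan, 0])

nan_mask = np.isnan(col)       # True where the value is NaN
zero_mask = col == 0           # True where the value is 0 (NaN == 0 is False)
print(nan_mask.astype(int))    # [0 0 1 0]

n_nan = np.count_nonzero(nan_mask)    # 1
n_zero = np.count_nonzero(zero_mask)  # 2

# Hypothetical valid range, purely for illustration.
LOW, HIGH = 1, 5000
range_mask = (col < LOW) | (col > HIGH)  # comparisons with NaN are False
n_bad = np.count_nonzero(nan_mask | range_mask)
print(n_bad)                   # 3
```

Combining the masks with | before counting avoids counting a value twice when it fails more than one check.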

Upvotes: 1
