christianbauer1

Reputation: 536

Python: Counting zeros in multiple array columns and storing them efficiently

I create an array:

import numpy as np
arr = [[0, 2, 3], [0, 1, 0], [0, 0, 1]]
arr = np.array(arr)

Now I count the zeros in each column and store each count in its own variable:

a = np.count_nonzero(arr[:,0]==0)
b = np.count_nonzero(arr[:,1]==0)
c = np.count_nonzero(arr[:,2]==0)

This code works fine. But in my case I have many more columns, each with over 70000 values. That would mean many more lines of code and a very messy variable explorer in Spyder.

My questions:

  1. Is there a way to make this code more efficient and store the values in a single data structure, e.g. a dictionary, DataFrame or tuple?
  2. Can I use a loop to create the dict, DataFrame or tuple?

Thank you

Upvotes: 3

Views: 3676

Answers (3)

tstanisl

Reputation: 14157

To count zeros, you can count the non-zeros along each column and subtract the result from the length of each column (the number of rows):

arr.shape[0] - np.count_nonzero(arr, axis=0)

produces array([3, 1, 1]).

This solution is very fast because no extra large objects are created.
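
Since the question asks for a single container, here is a minimal sketch of how the result could be stored in a dictionary keyed by column index. The names zero_counts and zeros_by_column are only illustrative, not part of the original answer.

```python
import numpy as np

arr = np.array([[0, 2, 3], [0, 1, 0], [0, 0, 1]])

# Zeros per column = number of rows minus non-zero entries per column.
zero_counts = arr.shape[0] - np.count_nonzero(arr, axis=0)

# Map column index -> zero count, converted to plain Python ints.
# (zeros_by_column is an illustrative name, not from the original post.)
zeros_by_column = {col: int(n) for col, n in enumerate(zero_counts)}

print(zeros_by_column)  # {0: 3, 1: 1, 2: 1}
```

A single count is then available as zeros_by_column[0], so no per-column variables are needed.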

Upvotes: 1

Gustav Rasmussen

Reputation: 3961

Use an ordered dict from the collections module:

from collections import OrderedDict
import numpy as np
from pprint import pprint as pp
import string

arr = np.array([[0, 2, 3], [0, 1, 0], [0, 0, 1]])
letters = string.ascii_letters
od = OrderedDict()

for i in range(arr.shape[1]):  # iterate over columns; len(arr) would be the number of rows
    od[letters[i]] = np.count_nonzero(arr[:, i]==0)

pp(od)

Returning:

OrderedDict([('a', 3), ('b', 1), ('c', 1)])

Example usage:

print(f"First number of zeros: {od.get('a')}")

Will give you:

First number of zeros: 3
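
Note that string.ascii_letters has only 52 characters, so letter keys run out for wider arrays. A possible variation, sketched under the assumption that generated names such as col_0, col_1, ... are acceptable as keys (the wide array below is made-up example data):

```python
import numpy as np

# Hypothetical data: 4 rows and 60 columns, i.e. more columns than
# string.ascii_letters has characters.
wide = np.zeros((4, 60), dtype=int)
wide[0, :] = 1  # make the first row non-zero, leaving 3 zeros per column

# Generated key names (col_0, col_1, ...) are an assumption for illustration.
# A plain dict keeps insertion order in Python 3.7+.
zeros = {
    f"col_{i}": int(np.count_nonzero(wide[:, i] == 0))
    for i in range(wide.shape[1])
}

print(zeros["col_0"], zeros["col_59"])
```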

Upvotes: 0

timgeb

Reputation: 78750

You can construct a boolean array arr == 0 and then take its sum along axis 0, i.e. down the rows of each column.

>>> (arr == 0).sum(0)
array([3, 1, 1])
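
As a rough sanity check at the size mentioned in the question, the following sketch compares the boolean-sum approach with the count_nonzero approach on random data. The 5 columns, the seed and the value range are assumptions made only for this example.

```python
import numpy as np

# Made-up test data: 70000 rows, 5 columns, values drawn from {0, 1, 2}.
rng = np.random.default_rng(0)
big = rng.integers(0, 3, size=(70000, 5))

# Approach 1: sum the boolean mask down each column.
via_sum = (big == 0).sum(axis=0)

# Approach 2: rows minus non-zero entries per column.
via_count = big.shape[0] - np.count_nonzero(big, axis=0)

# Both give one zero count per column and should agree.
print(np.array_equal(via_sum, via_count))
```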

Upvotes: 5
