Python: Extract the indices of repeated rows corresponding to the non-zero unique rows in a matrix

Question

For this matrix K =

[[1. 2. 3.]
 [0. 0. 0.]
 [4. 5. 6.]
 [0. 0. 0.]
 [4. 5. 6.]
 [0. 0. 0.]]

How can I store the list/array of indices of repeated rows corresponding to the non-zero unique rows in a matrix?

In this example, [0, 2] are the indices of the first occurrence of each non-zero unique row.

I would like to store this information in a dictionary:

   corresponding value for key 0: [0]
   corresponding value for key 2: [2,4]

Thanks!

jpp · Accepted Answer

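Here is one method using collections.defaultdict. It iterates over the rows with enumerate, converts each row to a tuple so that it is hashable, and appends each row's index to the list stored under that tuple.

You can easily remove (0, 0, 0) from the dictionary at the end, and rename keys if necessary. The method makes a single pass over the rows, so it is O(n) in the number of rows.

from collections import defaultdict

import numpy as np

A = np.array([[ 1,  2,  3],
              [ 0,  0,  0],
              [ 4,  5,  6],
              [ 0,  0,  0],
              [ 4,  5,  6],
              [ 0,  0,  0]])

# Maps each distinct row (as a tuple) to the list of indices where it occurs.
d = defaultdict(list)

# tolist() yields plain Python ints, so the keys print as (1, 2, 3).
for idx, row in enumerate(map(tuple, A.tolist())):
    d[row].append(idx)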

Result:

print(d)

defaultdict(list, {(0, 0, 0): [1, 3, 5],
                   (1, 2, 3): [0],
                   (4, 5, 6): [2, 4]})
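
To get from this result to the dictionary asked for in the question, the all-zero row can be dropped and each remaining group re-keyed by its first index. The following is a minimal sketch of that step, not part of the original answer; the names zero_row and result are illustrative, and it assumes "non-zero" means "not every entry is zero".

```python
from collections import defaultdict

import numpy as np

A = np.array([[1, 2, 3],
              [0, 0, 0],
              [4, 5, 6],
              [0, 0, 0],
              [4, 5, 6],
              [0, 0, 0]])

# Group row indices by row content, as in the loop above.
d = defaultdict(list)
for idx, row in enumerate(map(tuple, A.tolist())):
    d[row].append(idx)

# Drop the all-zero row (assumption: only an entirely zero row is excluded).
zero_row = (0,) * A.shape[1]
d.pop(zero_row, None)

# Re-key each group by the index of its first occurrence.
result = {indices[0]: indices for indices in d.values()}

print(result)  # {0: [0], 2: [2, 4]}
```

Because indices are appended in increasing order, indices[0] is always the first occurrence of that row. The same code should also work for a float matrix such as K, since (0, 0, 0) and (0.0, 0.0, 0.0) compare and hash equal in Python.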
