Lukasz

Reputation: 2606

Delete row and column in symmetric array if all the values in a row (or column) do not satisfy a given condition

I've got a sparse, symmetric array and I'm trying to delete a row and column of that array if all the individual entries of a given row (and column) do not satisfy some threshold condition. For example if

min_value = 2
a = np.array([[2, 2, 1, 0, 0], 
              [2, 0, 1, 4, 0], 
              [1, 1, 0, 0, 1], 
              [0, 4, 0, 1, 0], 
              [0, 0, 1, 0, 0]])

I would like to keep only the rows (and columns) that contain at least one value of 2 or more, so that the above example would yield

a_new = np.array([[2, 2, 0],
                  [2, 0, 4],
                  [0, 4, 1]])

So I would lose rows 3 and 5 (and columns 3 and 5) since every entry in them is less than 2. I've had a look at How could I remove the rows of an array if one of the elements of the row does not satisfy a condition?, Delete columns based on repeat value in one row in numpy array and Delete a column in a multi-dimensional array if all elements in that column satisfy a condition but the marked solutions do not fit what I'm attempting to accomplish.

I was thinking of performing something similar to:

a_new = []
min_count = 2

for row in a:
    for i in row:
        if i >= min_count:
            a_new.append(row)
print(a_new)

but this doesn't work, since it doesn't delete a bad column, and if a row has two (or more) values at or above the threshold it appends that row multiple times.

Upvotes: 1

Views: 552

Answers (1)

Divakar

Reputation: 221524

You could solve it with a vectorized approach, as shown below -

# Get valid mask
mask = a >= min_value

# As per requirements, look for ANY match along rows and cols and 
# use those masks to index into row and col dim of input array with
# 1D open meshes from np.ix_ and thus select a 2D slice out of it
out = a[np.ix_(mask.any(1),mask.any(0))]

A simpler way to express it would be by selecting rows and then columns, like so -

a[mask.any(1)][:,mask.any(0)]
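
As a quick check, here is a self-contained sketch that runs both forms on the array from the question and confirms they agree. The names rows, cols, out_ix and out_chained are only illustrative and not part of the snippets above.

```python
import numpy as np

min_value = 2
a = np.array([[2, 2, 1, 0, 0],
              [2, 0, 1, 4, 0],
              [1, 1, 0, 0, 1],
              [0, 4, 0, 1, 0],
              [0, 0, 1, 0, 0]])

# True wherever an entry meets the threshold
mask = a >= min_value

# 1D boolean masks: does the row / column contain at least one valid entry?
rows = mask.any(axis=1)
cols = mask.any(axis=0)

# Form 1: np.ix_ builds an open mesh, so rows and columns are selected in one step
out_ix = a[np.ix_(rows, cols)]

# Form 2: select rows first, then columns (creates an intermediate array)
out_chained = a[rows][:, cols]

print(out_ix)
print(np.array_equal(out_ix, out_chained))
```

The chained form is easier to read, but it materializes the row-selected array before slicing the columns, so the np.ix_ form is likely the better choice for large inputs.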

Exploiting the symmetry of the input array, this simplifies to -

mask0 = (a>=min_value).any(0)
out = a[np.ix_(mask0,mask0)]

Sample run -

In [488]: a
Out[488]: 
array([[2, 2, 1, 0, 0],
       [2, 0, 1, 4, 0],
       [1, 1, 0, 0, 1],
       [0, 4, 0, 1, 0],
       [0, 0, 1, 0, 0]])

In [489]: min_value
Out[489]: 2

In [490]: mask0 = (a>=min_value).any(0)

In [491]: a[np.ix_(mask0,mask0)]
Out[491]: 
array([[2, 2, 0],
       [2, 0, 4],
       [0, 4, 1]])

Alternatively, we can use row and column indices of valid mask, like so -

r,c = np.where(a>=min_value)
out = a[np.unique(r)[:,None],np.unique(c)]
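
This index-based form does not rely on symmetry. The sketch below applies it to a small non-symmetric array b, which is a made-up example and not from the question, to show how the column vector of row indices broadcasts against the 1D array of column indices.

```python
import numpy as np

min_value = 2

# Hypothetical non-symmetric input, used only for illustration
b = np.array([[0, 3, 0],
              [0, 0, 0],
              [1, 0, 5]])

# Row and column indices of every entry that meets the threshold
r, c = np.where(b >= min_value)

# Deduplicate: each kept row / column index should appear once
row_idx = np.unique(r)
col_idx = np.unique(c)

# row_idx[:, None] has shape (n_rows, 1) and col_idx has shape (n_cols,),
# so they broadcast to an (n_rows, n_cols) grid of index pairs
out = b[row_idx[:, None], col_idx]

print(row_idx, col_idx)
print(out)
```

Here rows 0 and 2 and columns 1 and 2 survive, so the result is a 2x2 array.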

Again exploiting the symmetry, the simplified version would be -

r = np.unique(np.where(a>=min_value)[0])
out = a[np.ix_(r,r)]

r could also be obtained with a mix of boolean operations -

r = np.flatnonzero((a>=min_value).any(0))
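
A short sketch, using the array from the question, that checks both ways of computing r give the same indices and the same output. The names r_where and r_flat are only illustrative.

```python
import numpy as np

min_value = 2
a = np.array([[2, 2, 1, 0, 0],
              [2, 0, 1, 4, 0],
              [1, 1, 0, 0, 1],
              [0, 4, 0, 1, 0],
              [0, 0, 1, 0, 0]])

# Variant 1: unique row indices of all entries meeting the threshold
r_where = np.unique(np.where(a >= min_value)[0])

# Variant 2: positions where the per-column "any" mask is True
# (for a symmetric array the column mask equals the row mask)
r_flat = np.flatnonzero((a >= min_value).any(0))

# Use the same index array for rows and columns
out = a[np.ix_(r_flat, r_flat)]

print(r_where, r_flat)
print(out)
```

The flatnonzero variant avoids building the full list of matching coordinates and then deduplicating it, which may matter when many entries pass the threshold.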

Upvotes: 1
