Efficient way to find the index of the max upper triangular entry in a numpy array?

Question

More specifically, I have a list of rows/columns that need to be ignored when choosing the max entry. In other words, when choosing the max upper triangular entry, certain indices need to skipped. In that case, what is the most efficient way to find the location of the max upper triangular entry?

For example:

>>> a
array([[0, 1, 1, 1],
       [1, 2, 3, 4],
       [4, 5, 6, 6],
       [4, 5, 6, 7]])
>>> indices_to_skip = [0,1,2]

I need to find the index of the min element among all elements in the upper triangle except for the entries a[0,1], a[0,2], and a[1,2].

Daniel · Accepted Answer

You can use np.triu_indices_from:

>>> np.vstack(np.triu_indices_from(a,k=1)).T
array([[0, 1],
       [0, 2],
       [0, 3],
       [1, 2],
       [1, 3],
       [2, 3]])

>>> inds=inds[inds[:,1]>2] #Or whatever columns you want to start from.
>>> inds
array([[0, 3],
       [1, 3],
       [2, 3]])


>>> a[inds[:,0],inds[:,1]]
array([1, 4, 6])

>>> max_index = np.argmax(a[inds[:,0],inds[:,1]])
>>> inds[max_index]
array([2, 3]])

Or:

>>> inds=np.triu_indices_from(a,k=1)
>>> mask = (inds[1]>2) #Again change 2 for whatever columns you want to start at.
>>> a[inds][mask]
array([1, 4, 6])

>>> max_index = np.argmax(a[inds][mask])
>>> inds[mask][max_index]
array([2, 3]])

For the above you can use inds[0] to skip certains rows.

To skip specific rows or columns:

def ignore_upper(arr, k=0, skip_rows=None, skip_cols=None):
    rows, cols = np.triu_indices_from(arr, k=k)

    if skip_rows != None:
        row_mask = ~np.in1d(rows, skip_rows)
        rows = rows[row_mask]
        cols = cols[row_mask]

    if skip_cols != None:
        col_mask = ~np.in1d(cols, skip_cols)
        rows = rows[col_mask]
        cols = cols[col_mask]

    inds=np.ravel_multi_index((rows,cols),arr.shape)
    return np.take(arr,inds)

print ignore_upper(a, skip_rows=1, skip_cols=2) #Will also take numpy arrays for skipping.
[0 1 1 6 7]

The two can be combined and creative use of boolean indexing can help speed up specific cases.

Something interesting that I ran across, a faster way to take upper triu indices:

def fast_triu_indices(dim,k=0):

    tmp_range = np.arange(dim-k)
    rows = np.repeat(tmp_range,(tmp_range+1)[::-1])

    cols = np.ones(rows.shape[0],dtype=np.int)
    inds = np.cumsum(tmp_range[1:][::-1]+1)

    np.put(cols,inds,np.arange(dim*-1+2+k,1))
    cols[0] = k
    np.cumsum(cols,out=cols)
    return (rows,cols)

Its about ~6x faster although it does not work for k<0:

dim=5000
a=np.random.rand(dim,dim)

k=50
t=time.time()
rows,cols=np.triu_indices(dim,k=k)
print time.time()-t
0.913508892059

t=time.time()
rows2,cols2,=fast_triu_indices(dim,k=k)
print time.time()-t
0.16515994072

print np.allclose(rows,rows2)
True

print np.allclose(cols,cols2)
True

Efficient way to find the index of the max upper triangular entry in a numpy array?

Answers (1)

Related Questions