Liky

Reputation: 1247

Mapping diagonal

Let's say I have the following dataframe:

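import numpy as np
import pandas as pd

idx = ['H', 'A', 'B', 'C', 'D']
idxp = idx[1:] + [idx[0]]    # labels rotated left by one
idxm = [idx[-1]] + idx[:-1]  # labels rotated right by one
idx, idxp, idxm
j = np.arange(25).reshape(5, 5)
# Zero the diagonal before wrapping: J.values can be a read-only view under pandas copy-on-write.
np.fill_diagonal(j, 0)
J = pd.DataFrame(j, index=idx, columns=idx)
J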


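As output, I would like to get an array of running sums of the first superdiagonal of J, one row per starting position, placed above the main diagonal. In other words, that would give us the matrix below: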

m_exp = np.array([[0, 1, 8, 21, 40],
                  [0, 0, 7, 20, 39],
                  [0, 0, 0, 13, 32],
                  [0, 0, 0,  0, 19],
                  [0, 0, 0,  0,  0]])
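Read entry by entry, m_exp[i, j] for j > i is the sum of the superdiagonal elements J[k, k+1] for k = i, ..., j-1, and everything on or below the main diagonal is zero. A deliberately naive double loop makes that definition concrete. The name `travelup_naive` is made up for illustration, and the loop is only meant as a reference for checking faster versions, not as a solution:

```python
import numpy as np

def travelup_naive(a):
    """Reference version: out[i, j] = sum(a[k, k+1] for k in range(i, j))."""
    n = a.shape[0]
    d = np.diagonal(a, 1)                 # first superdiagonal, length n-1
    out = np.zeros((n, n), dtype=a.dtype)
    for i in range(n):
        for j in range(i + 1, n):
            out[i, j] = d[i:j].sum()      # sum of superdiagonal entries i..j-1
    return out

# Same values as J in the question, built directly as a NumPy array.
a = np.arange(25).reshape(5, 5)
np.fill_diagonal(a, 0)
m_naive = travelup_naive(a)
```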

The best way I have found so far to calculate this matrix is by using the code below:

travelup = np.array([np.pad(np.cumsum(J.values.diagonal(1)[n:]), (n+1,0), 'constant') for n in range(J.values.shape[0])])

However, this involves a list comprehension, and in practice my matrix is much bigger and this code is called thousands of times.

Is there a way to vectorize this, for example with some kind of mapping, so that it runs faster without the Python-level loop?

Upvotes: 2

Views: 242

Answers (1)

Divakar

Reputation: 221504

A few methods are listed below.

I. Basic method

a = J.values
p = np.r_[0,a.ravel()[1::a.shape[1]+1]] # or np.r_[0,np.diag(a,1)]
n = len(p)
out = np.triu(np.broadcast_to(p,(n,n)),1).cumsum(1)

p and n are reused in the alternatives listed next.
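As a rough, self-contained sketch of the basic method, the snippet below wraps the same three steps in a function (the name `travelup_basic` is hypothetical) and rebuilds the question's list comprehension next to it so the two results can be compared. It works on a plain NumPy array with the same values as J:

```python
import numpy as np

def travelup_basic(a):
    # Prepend 0 to the first superdiagonal so that p[j] lines up with column j.
    p = np.r_[0, np.diag(a, 1)]
    n = len(p)
    # Every row starts as a copy of p; triu(..., 1) zeroes entries on or below the diagonal.
    masked = np.triu(np.broadcast_to(p, (n, n)), 1)
    # A row-wise running sum then accumulates p[i+1..j] in row i.
    return masked.cumsum(axis=1)

a = np.arange(25).reshape(5, 5)
np.fill_diagonal(a, 0)

# The list comprehension from the question, for comparison.
loop = np.array([np.pad(np.cumsum(a.diagonal(1)[n:]), (n + 1, 0), 'constant')
                 for n in range(a.shape[0])])
vectorized = travelup_basic(a)
```

Comparing `vectorized` with `loop` via `np.array_equal` is a cheap sanity check before swapping one for the other in a hot path.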

A. Alternative #1

Alternatively with broadcasted-multiplication to get the final output -

out = (~np.tri(n, dtype=bool)*p).cumsum(1)

B. Alternative #2

Alternatively with outer-subtraction on cumsum -

c = p.cumsum()
out = np.triu(c-c[:,None])
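The outer subtraction works because c[j] - c[i] is a difference of prefix sums: with c = cumsum(p), it equals p[i+1] + ... + p[j], which is the same running sum the basic method builds row by row. A small worked example, with p written out by hand for the 5x5 case from the question:

```python
import numpy as np

p = np.array([0, 1, 7, 13, 19])   # 0 followed by the superdiagonal of J
c = p.cumsum()                    # prefix sums: 0, 1, 8, 21, 40

# A difference of prefix sums equals the sum of the slice in between.
i, j = 1, 3
slice_sum = p[i + 1:j + 1].sum()  # 7 + 13
diff = c[j] - c[i]                # 21 - 1

# Broadcasting c (a row) against c[:, None] (a column) forms all pairs at once;
# triu drops the negative values that appear below the diagonal.
out = np.triu(c - c[:, None])
```

This version needs a single cumsum over n values instead of one over an n-by-n array, at the cost of computing (and then discarding) the lower triangle.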

C. Alternative #3

Alternatively with np.tri to replace np.triu -

out = (c-c[:,None])*~np.tri(n, dtype=bool)

c is reused in the alternatives listed next.

II. With numexpr

For large arrays, numexpr can spread the work across multiple cores. Hence, the alternatives would be -

import numexpr as ne

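# Every operand is passed explicitly, so the call does not rely on c being visible in the caller's namespace.
out = ne.evaluate('(c-c2D)*M',{'c':c,'c2D':c[:,None],'M':~np.tri(n, dtype=bool)})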

A. Alternative #1

out = ne.evaluate('(c-c2D)*(~M)',{'c':c,'c2D':c[:,None],'M':np.tri(n, dtype=bool)})

B. Alternative #2

r = np.arange(n)
out = ne.evaluate('(c-c2D)*(r2D<r)',{'c':c,'c2D':c[:,None],'r':r,'r2D':r[:,None]})
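For code that gets called thousands of times, it may help to wrap the index-comparison variant in a function. The sketch below is one possible shape, not the answer's exact code: the name `travelup_outer` is hypothetical, it uses `where(...)` in place of multiplying by the mask, and it falls back to plain NumPy when numexpr is not installed.

```python
import numpy as np

try:
    import numexpr as ne   # optional dependency
except ImportError:
    ne = None

def travelup_outer(a):
    c = np.r_[0, np.diag(a, 1)].cumsum()   # prefix sums of the superdiagonal
    r = np.arange(len(c))
    c2d, r2d = c[:, None], r[:, None]      # column versions for broadcasting
    if ne is None:
        # NumPy fallback: same formula, evaluated without numexpr.
        return np.where(r2d < r, c - c2d, 0)
    # local_dict names every operand, so nothing depends on the caller's namespace.
    return ne.evaluate('where(r2d < r, c - c2d, 0)',
                       local_dict={'c': c, 'c2d': c2d, 'r': r, 'r2d': r2d})

a = np.arange(25).reshape(5, 5)
np.fill_diagonal(a, 0)
out = travelup_outer(a)
```

Whether numexpr actually wins depends on the array size; for small matrices the plain NumPy path may well be faster, so timing both on realistic inputs is worthwhile.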

Upvotes: 2
