Reputation: 478
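I am trying to multiply along the leading diagonals of a pandas DataFrame, so that each element becomes the product of itself and everything below and to the right of it on the same diagonal, and I am not sure how to do this in a computationally reasonable way.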
df = [ 3  4  5
       6  7  8
       9 10 11]

output_df = [231 32  5
              60 77  8
               9 10 11]
Explanation: I am looking to compute 3 * 7 * 11 for the first element, 4 * 8 for the second element, 7 * 11 for the fifth element, and so on.
Note: The matrix I am working on is not a square matrix, but a rectangular matrix.
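For reference, the intended operation can be written as a plain double loop. This is only a minimal sketch, assuming each element should become the product of itself and everything below and to the right of it on the same diagonal; the name reverse_diag_cumprod_naive is hypothetical and not part of pandas or NumPy.

```python
import numpy as np

def reverse_diag_cumprod_naive(a):
    """Hypothetical reference helper: out[i, j] = a[i, j] * a[i+1, j+1] * ..."""
    a = np.asarray(a)
    m, n = a.shape
    out = a.copy()
    # Walk from the bottom-right corner so that out[i+1, j+1] is already final.
    for i in range(m - 2, -1, -1):
        for j in range(n - 2, -1, -1):
            out[i, j] = a[i, j] * out[i + 1, j + 1]
    return out

square = reverse_diag_cumprod_naive([[3, 4, 5], [6, 7, 8], [9, 10, 11]])
wide = reverse_diag_cumprod_naive([[1, 2, 3, 4], [5, 6, 7, 8]])
print(square)
print(wide)
```

The second call uses a made-up 2 x 4 input to show that nothing in the loop depends on the matrix being square.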
Upvotes: 3
Views: 354
Reputation: 2188
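Here's a method that operates on the DataFrame in place. Note that it relies on df.values returning a writable view of the data, which is not guaranteed (for example with mixed dtypes or with copy-on-write enabled).
import numpy as np
import pandas as pd

df = pd.DataFrame(data=[[3, 4, 5], [6, 7, 8], [9, 10, 11]])
m, n = df.shape
# i is the diagonal offset: negative below the main diagonal, positive above
for i in range(-m + 1, n):
    # first/last row and column of the square block whose main diagonal is diagonal i
    ri, rj = max(-i, 0), min(m - 1, n - i - 1)
    ci, cj = max( i, 0), min(n - 1, m + i - 1)
    # overwrite that diagonal with its reversed cumulative product
    np.fill_diagonal(df.values[ri:rj+1, ci:cj+1],
                     df.values.diagonal(i)[::-1].cumprod()[::-1])
print(df)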
Result:
     0   1   2
0  231  32   5
1   60  77   8
2    9  10  11
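If writing into df.values is not an option (for instance because the array turns out to be read-only), the same per-diagonal idea can be run on a copy instead. The following is a minimal sketch under that assumption, using explicit index arrays rather than np.fill_diagonal; reverse_diag_cumprod is a hypothetical name, and the 4 x 2 input is made up to exercise the non-square case.

```python
import numpy as np

def reverse_diag_cumprod(values):
    """Hypothetical helper: reversed cumulative product along every diagonal, on a copy."""
    out = np.array(values)  # np.array copies, so the input is left untouched
    m, n = out.shape
    for k in range(-m + 1, n):
        # Row and column index arrays of diagonal k (k < 0 is below the main diagonal).
        length = min(m + min(k, 0), n - max(k, 0))
        rows = np.arange(length) + max(-k, 0)
        cols = np.arange(length) + max(k, 0)
        out[rows, cols] = out[rows, cols][::-1].cumprod()[::-1]
    return out

square = reverse_diag_cumprod([[3, 4, 5], [6, 7, 8], [9, 10, 11]])
tall = reverse_diag_cumprod([[1, 2], [3, 4], [5, 6], [7, 8]])
print(square)
print(tall)
```

The result can be wrapped back into a frame with pd.DataFrame(out, index=df.index, columns=df.columns) if the original labels are needed.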
Upvotes: -1
Reputation: 221524
Here's one based on NumPy -
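import numpy as np
import pandas as pd

def cumprod_upper_diag(a):
    # In place: reversed cumulative product along each diagonal strictly above the main one
    m, n = a.shape
    mask = ~np.tri(m, n, dtype=bool)
    p = np.ones((m, n), dtype=a.dtype)
    # left-justify the upper triangle so that each diagonal becomes a column of p
    p[mask[:, ::-1]] = a[mask]
    a[mask] = p[::-1].cumprod(0)[::-1][mask[:, ::-1]]
    return a

a = df.to_numpy(copy=False)  # For older versions: a = df.values
out = a.copy()
cumprod_upper_diag(out)    # diagonals above the main one
cumprod_upper_diag(out.T)  # diagonals below the main one, via the transposed view
# main diagonal, indexed explicitly so that it also works with more rows than columns
d = np.arange(min(out.shape))
out[d, d] = out[d, d][::-1].cumprod()[::-1]
out_df = pd.DataFrame(out)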
Upvotes: 3
Reputation: 51165
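You can use a sparse diagonal matrix here with a little fiddling. This assumes that every element of your original matrix is non-zero; otherwise it will not work, because the code cannot tell a genuine zero from the padding in the sparse storage and replaces both with 1.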
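import numpy as np
from scipy import sparse

a = df.to_numpy()
b = sparse.dia_matrix(a)  # one row of b.data per diagonal, padded with zeros
c = b.data[:, ::-1]
cp = np.cumprod(np.where(c != 0, c, 1), axis=1)  # treat the padding as 1
b.data = cp[:, ::-1]
b.toarray()  # dense result; same as b.A where that shorthand is available

array([[231,  32,   5],
       [ 60,  77,   8],
       [  9,  10,  11]], dtype=int64)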
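The non-zero requirement can be illustrated on a single diagonal without SciPy. This is just a sketch with made-up values, assuming the same np.where substitution as in the code above: once a genuine zero is swapped for 1, the reversed cumulative product no longer matches the true one.

```python
import numpy as np

diag = np.array([2, 0, 3])  # one diagonal containing a genuine zero (made-up values)

# True reversed cumulative product: everything up to and including the zero is 0.
expected = diag[::-1].cumprod()[::-1]

# Same computation after replacing zeros with 1, as the padding trick does.
masked = np.where(diag != 0, diag, 1)[::-1].cumprod()[::-1]

print(expected)
print(masked)
```

A cheap guard such as checking (a != 0).all() before taking this route would catch the problem up front.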
Upvotes: 2
Reputation: 150735
As Chris mentioned, this is cumprod in reverse order:
# stack for groupby
new_df = df.stack().reset_index()[::-1]
# elements on the same diagonal share the same row_num - col_num
diags = new_df['level_0'] - new_df['level_1']
# groupby diagonals
new_df['out'] = new_df.groupby(diags)[0].cumprod()
# pivot to get the original shape (keyword arguments are required in recent pandas)
new_df.pivot(index='level_0', columns='level_1', values='out')
Output:
level_1    0   1   2
level_0
0        231  32   5
1         60  77   8
2          9  10  11
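The grouping key used here can be seen directly on a small grid. This is only an illustrative sketch in NumPy, with a 2 x 4 shape picked arbitrarily for the example: cells that share the same row index minus column index lie on the same top-left to bottom-right diagonal, which is what makes a groupby on that difference work for non-square frames too.

```python
import numpy as np

# Row and column index of every cell in a 2 x 4 grid (shape chosen for illustration).
rows, cols = np.indices((2, 4))

# Constant along each top-left to bottom-right diagonal.
diag_id = rows - cols
print(diag_id)
```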
Upvotes: 1