Fluxy

Reputation: 2978

How to get median values across diagonal lines in a matrix?

I have the following matrix in pandas:

import numpy as np
import pandas as pd

df_matrix = pd.DataFrame(np.random.random((10, 10)))

I need a vector of 10 median values, one for each blue line in the picture below:

[Image: the matrix with blue diagonal lines; not available]

The last value in the output vector is just a single matrix element rather than a median, because that line crosses only one cell.

Upvotes: 2

Views: 124

Answers (2)

fiphrelin

Reputation: 320

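X = np.random.random((10, 10))
fX = np.fliplr(X)  # flip left-right so the "other" diagonals become ordinary diagonals
# k=0 is the main anti-diagonal of X; the last k picks the single bottom-right element
np.array([np.median(np.diag(fX, k=-k)) for k in range(X.shape[0])])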
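To see which cells each np.diag call picks up, here is a small sketch on a 4x4 matrix with known entries. The helper name anti_diagonal_medians is made up for illustration, and it assumes a square 2-D array.

```python
import numpy as np

def anti_diagonal_medians(X):
    """Median of each anti-diagonal, from the main one down to the bottom-right corner.

    Hypothetical helper for illustration; assumes X is a square 2-D array.
    """
    fX = np.fliplr(X)  # anti-diagonals of X become ordinary diagonals of fX
    return np.array([np.median(np.diag(fX, k=-k)) for k in range(X.shape[0])])

X = np.arange(16).reshape(4, 4)
# X =
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]
#  [12 13 14 15]]
result = anti_diagonal_medians(X)
# Anti-diagonals used: (3, 6, 9, 12), (7, 10, 13), (11, 14), (15,)
# Medians: 7.5, 10.0, 12.5, 15.0
print(result)
```

The last entry is the lone bottom-right element, which matches the single-value case mentioned in the question.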

Upvotes: 5

Quang Hoang

Reputation: 150735

Along each of these diagonals, row_num + col_num is constant. So you can stack the frame, add the row and column labels, and group by that sum:

(df_matrix.stack().reset_index(name='val')
   .assign(diag=lambda x: x.level_0+x.level_1)  # enumerate the diagonals
   .groupby('diag')['val'].median()             # median by diagonal
   .loc[len(df_matrix) - 1:]                    # main anti-diagonal and everything below it
)

Output (for np.random.seed(42)):

diag
9     0.473090
10    0.330898
11    0.531382
12    0.440152
13    0.548075
14    0.325330
15    0.580145
16    0.427541
17    0.248817
18    0.107891
Name: val, dtype: float64
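As a sanity check, the groupby approach can be compared with the np.diag approach from the other answer. This is only a sketch: the seed is arbitrary, the variable names by_group and by_diag are made up, and the slice starts at n - 1 so that the main anti-diagonal (diag 9 in the output above) is included.

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # arbitrary seed, only for reproducibility
df_matrix = pd.DataFrame(np.random.random((10, 10)))
n = len(df_matrix)

# pandas route: label each cell with row + col, then take the median per label
by_group = (
    df_matrix.stack().reset_index(name='val')
    .assign(diag=lambda x: x.level_0 + x.level_1)
    .groupby('diag')['val'].median()
    .loc[n - 1:]  # labels n-1 .. 2n-2: main anti-diagonal down to the corner
)

# NumPy route: flip left-right and read ordinary diagonals at or below the main one
fX = np.fliplr(df_matrix.to_numpy())
by_diag = np.array([np.median(np.diag(fX, k=-k)) for k in range(n)])

same = np.allclose(by_group.to_numpy(), by_diag)
print(same)
```

Both routes visit the diagonals in the same order (row + col = 9 first, 18 last), so the two vectors line up element by element.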

Upvotes: 1
