joshsol

Reputation: 61

How to find max from dynamic number of columns

I have two dataframes: A, a 20*15 matrix of numbers, and B, a 20*1 column of numbers (each between 1 and 15).

For each row of table A, I would like to find the max number, but only looking at the first n columns, where n is that row's value in table B.

Simplified example below.

Thanks!

+-----------------+
|       A:        |
+-----------------+
| 7  3  5  4      |
| 8  1  2  5      |
| 2  3  7  2      |
| 4  1  3  6      |
+-----------------+

+-----------------+
|       B:        |
+-----------------+
| 2               |
| 4               |
| 1               |
| 2               |
+-----------------+

+-----------------+
| Desired result: |
+-----------------+
| 7               |
| 8               |
| 2               |
| 4               |
+-----------------+

Upvotes: 2

Views: 383

Answers (3)

rafaelc

Reputation: 59274

Using pd.DataFrame.where and np.ones

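import numpy as np
# m holds the 1-based column position of every cell: [1, 2, ..., ncols] in each row
m = np.ones(dfa.shape).cumsum(1)
dfa.where(m <= dfb.to_numpy()).max(1)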

You can also build m with

m = np.broadcast_to(np.arange(dfa.shape[1]) + 1, dfa.shape)

0    7.0
1    8.0
2    2.0
3    4.0
dtype: float64
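
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the question's sample data and keeps the intermediate mask in its own variable. The names `dfa` and `dfb` follow the answer; the default integer column labels and the variable name `keep` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Sample data from the question (default 0..3 column labels are an assumption).
dfa = pd.DataFrame([[7, 3, 5, 4],
                    [8, 1, 2, 5],
                    [2, 3, 7, 2],
                    [4, 1, 3, 6]])
dfb = pd.DataFrame([2, 4, 1, 2])

# 1-based column position of every cell: each row is [1, 2, 3, 4].
m = np.ones(dfa.shape).cumsum(axis=1)

# dfb.to_numpy() has shape (4, 1), so the comparison broadcasts across each row.
keep = m <= dfb.to_numpy()

# Cells outside the first n columns become NaN, which max skips by default.
result = dfa.where(keep).max(axis=1)
print(result.tolist())
```

The mask has one True per column that should be considered, so its row sums should match the values in `dfb`.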

Upvotes: 3

BENY

Reputation: 323326

pandas solution

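# Assumes B is a Series and A's columns are labelled 1..n (1-based), matching the values in B
S = A.stack()
S[B.reindex(S.index.get_level_values(0)).values >= S.index.get_level_values(1)].groupby(level=0).max()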
Out[276]: 
0    7
1    8
2    2
3    4
dtype: int64
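
A self-contained sketch of the same stack-and-filter idea, in case the column labels are the default 0-based positions rather than 1-based. The frame names `dfa` and `dfb`, the single column `0` in `dfb`, and the helper variable names are assumptions for illustration, not part of the answer above.

```python
import pandas as pd

# Sample data from the question, with default 0-based row and column labels.
dfa = pd.DataFrame([[7, 3, 5, 4],
                    [8, 1, 2, 5],
                    [2, 3, 7, 2],
                    [4, 1, 3, 6]])
dfb = pd.DataFrame([2, 4, 1, 2])

# Long format: one entry per (row label, column label) pair.
s = dfa.stack()
row_labels = s.index.get_level_values(0)
col_labels = s.index.get_level_values(1)

# Look up each entry's column limit n by its row label.
limit = dfb[0].reindex(row_labels).to_numpy()

# With 0-based labels, column c belongs to the first n columns when c < n.
result = s[col_labels.to_numpy() < limit].groupby(level=0).max()
print(result.tolist())
```

Because nothing is replaced with NaN here, the result stays integer-typed.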

Upvotes: 2

ALollz

Reputation: 59579

where + max

You want the maximum value in the first n columns of each row, where n comes from your second dataframe. So mask the cells outside that range, then take the max; max ignores NaN by default.

import numpy as np

m = np.arange(dfa.shape[1]) < dfb[0].to_numpy()[:, None]  # Thanks rafaelc
dfa.where(m).max(1)

#0    7.0
#1    8.0
#2    2.0
#3    4.0
#dtype: float64

Sample Data:

dfa
   0  1  2  3
0  7  3  5  4
1  8  1  2  5
2  2  3  7  2
3  4  1  3  6

dfb
   0
0  2
1  4
2  1
3  2
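
One side effect of masking with `where` is that the result comes back as float64, since the masked cells are NaN. If integer output matters, a possible variant is to do the masking in NumPy and fill the excluded cells with the smallest integer of the dtype. This is only a sketch: it assumes `dfa` holds integers and every value in `dfb` is at least 1, and the names `values`, `fill` and `row_max` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample data as shown above.
dfa = pd.DataFrame([[7, 3, 5, 4],
                    [8, 1, 2, 5],
                    [2, 3, 7, 2],
                    [4, 1, 3, 6]])
dfb = pd.DataFrame([2, 4, 1, 2])

values = dfa.to_numpy()
n = dfb[0].to_numpy()[:, None]           # shape (rows, 1) so it broadcasts per row
mask = np.arange(values.shape[1]) < n    # True for the first n columns of each row

# Excluded cells get the smallest representable integer, so they can never win the max.
# Assumes an integer dtype and at least one kept cell per row.
fill = np.iinfo(values.dtype).min
row_max = pd.Series(np.where(mask, values, fill).max(axis=1), index=dfa.index)
print(row_max.tolist())
```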

Upvotes: 3
