Reputation: 61
I have two dataframes: A, a 20x15 matrix of numbers, and B, a 20x1 column of numbers (each from 1 to 15).
For each row of A, I would like to find the max, but only looking at the first n columns, where n is the value in the same row of B.
Simplified example below.
Thanks!
+-----------------+
| A:              |
+-----------------+
| 7 3 5 4         |
| 8 1 2 5         |
| 2 3 7 2         |
| 4 1 3 6         |
+-----------------+
| B:              |
+-----------------+
| 2               |
| 4               |
| 1               |
| 2               |
+-----------------+
| Desired result: |
+-----------------+
| 7               |
| 8               |
| 2               |
| 4               |
+-----------------+
Upvotes: 2
Views: 383
Reputation: 59274
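Using pd.DataFrame.where and np.ones:
m = np.ones(dfa.shape).cumsum(1)  # every row of m is 1, 2, ..., n_cols
dfa.where(m <= dfb.to_numpy()).max(1)
You can also build the same mask from the column count (not the row count, so it works for non-square frames):
m = np.broadcast_to(np.arange(dfa.shape[1]) + 1, dfa.shape)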
0 7.0
1 8.0
2 2.0
3 4.0
dtype: float64
Upvotes: 3
Reputation: 323326
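pandas solution
# Assumes B is a Series (for a one-column DataFrame, use B.iloc[:, 0]).
# >= matches columns labelled 1, 2, ...; with default 0-based labels use > instead.
# groupby(level=0).max() replaces max(level=0), which was removed in pandas 2.0.
S=A.stack()
S[B.reindex(S.index.get_level_values(0)).values>=S.index.get_level_values(1)].groupby(level=0).max()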
Out[276]:
0 7
1 8
2 2
3 4
dtype: int64
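For reference, a self-contained sketch of the stack approach on the question's sample data. It assumes A has default 0-based column labels and B is a Series, so the comparison is a strict `>`. It also uses `groupby(level=0).max()`, since `max(level=0)` is not available in pandas 2.x. The names `rows`, `cols` and `keep` are only illustrative.

```python
import pandas as pd

# Sample data from the question (default 0-based row and column labels).
A = pd.DataFrame([[7, 3, 5, 4],
                  [8, 1, 2, 5],
                  [2, 3, 7, 2],
                  [4, 1, 3, 6]])
B = pd.Series([2, 4, 1, 2])

# Long format: one value per (row, column) pair.
S = A.stack()
rows = S.index.get_level_values(0)
cols = S.index.get_level_values(1)

# Keep a cell only if its 0-based column position is below that row's n.
keep = B.reindex(rows).to_numpy() > cols.to_numpy()

# Max of the surviving cells, grouped by original row.
result = S[keep].groupby(level=0).max()
print(result)
```

Because no NaN is introduced here, the result should keep the integer dtype, unlike the where-based approaches.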
Upvotes: 2
Reputation: 59579
where + max
You want to find the maximum value in the first n columns of each row, where n comes from your second dataframe. So mask the cells that don't matter, then take the max, since max ignores NaN by default.
import numpy as np
m = np.arange(dfa.shape[1]) < dfb[0].to_numpy()[:, None] # Thanks rafaelc
dfa.where(m).max(1)
#0 7.0
#1 8.0
#2 2.0
#3 4.0
#dtype: float64
Sample Data:
dfa
   0  1  2  3
0  7  3  5  4
1  8  1  2  5
2  2  3  7  2
3  4  1  3  6
dfb
   0
0  2
1  4
2  1
3  2
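Putting the sample data and the mask together, here is a minimal copy-pasteable sketch. The constructors are just one assumed way to build dfa and dfb, and `n` and `result` are illustrative names. The column of dfb is converted with `.to_numpy()` before the new axis is added, because recent pandas versions do not allow `[:, None]` directly on a Series.

```python
import numpy as np
import pandas as pd

# Sample data (default 0-based labels on both axes).
dfa = pd.DataFrame([[7, 3, 5, 4],
                    [8, 1, 2, 5],
                    [2, 3, 7, 2],
                    [4, 1, 3, 6]])
dfb = pd.DataFrame([2, 4, 1, 2])

# n as a column vector, shape (n_rows, 1), so it broadcasts against column positions.
n = dfb[0].to_numpy()[:, None]

# True where the 0-based column position is among the first n of that row.
m = np.arange(dfa.shape[1]) < n

# Masked-out cells become NaN, which max skips by default.
result = dfa.where(m).max(axis=1)
print(result)
```

The result is float64 because the masked cells are NaN. If integers are needed, `result.astype(int)` should be safe here, as every row keeps at least one value when n is at least 1.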
Upvotes: 3