Reputation: 8033

Max & Min for the rows based on conditions in Pandas (column name dependent)

I have a df as below

Astrt   Aend    Bstrt   Bend    Xstrt   Xend    Ystrt   Yend
25        27    15        16    11        12    40      42
50        51    45        46    23        25    35      36
14        15    21        20    8          9    2        3
11        11    45        49    46        47    12       13

In this df, columns starting with A are linked with columns starting with X and columns starting with B are linked with columns starting with Y.

What I want is

Identify the largest of ALL columns that has `strt', that is Astrt,Bstrt,Xstrt & Ystrt and put in a new column 'Tsrt'
Then the corresponding 'end' value in anew column 'Tend'. So, for example if is 'Bstrt' is the biggest, then 'Bend' (does not matter if it is biggest or not) goes into column 'Tend'
As I said, columns starting with A are linked with columns starting with X and columns starting with B are linked with columns starting with Y. So the 2 new columns 'Ustrt' & 'Uend' should be filled with start & end values of respective columns For ex, if 'Xstrt' is the biggest, then 'Ustrt' & 'Uend' column will have values from 'Astrt' & 'Aend' (no analysis needed, just directly the values).

So basically, we need to find the max of ALL columns that has `strt', identify the name of that & copy values based on its relationship.

The expected out of the df above is as below.

Astrt   Aend    Bstrt   Bend    Xstrt   Xend    Ystrt   Yend    Tstrt   Tend    Ustrt   Uend
25        27       15   16         11   12         40   42        40    42         15   16
50        51       45   46         23   25         35   36        50    51         23   25
14        15       21   20          8   9           2   3         21    20          2   3
11        11       45   49         46   47         12   13        46    47         11   11

Hopefully, it is clear what i am trying to do. Can someone help? Thanks.

Upvotes: 3

Answers (3)

sammywemmy

Reputation: 28729

Another solution :

# create temporary dataframes for start and end
strt = df.filter(regex=".strt$")
end = df.filter(regex=".end$")

Tstrt = strt.max(1).array    
cond = strt.isin(Tstrt)
cond = cond.to_numpy()
Tend = end.to_numpy()[cond]

# reshape boolean array based on positions (A replaces X, B replaces Y and vice versa)
Ustrt = strt.to_numpy()[cond[:, [2, 3, 0, 1]]]
Uend = end.to_numpy()[cond[:, [2, 3, 0, 1]]]

df.assign(Tstrt=Tstrt, Tend=Tend, Ustrt=Ustrt, Uend=Uend)

No idea how to properly paste wide dataframe in Stack OVerflow :

    Astrt   Aend    Bstrt   Bend    Xstrt   Xend    Ystrt   Yend    Tstrt   Tend    Ustrt   Uend
0   25     27      15      16      11      12      40      42     40       42   15  16
1   50     51      45      46      23      25      35      36     50       51   23  25
2   14     15      21      20      8       9       2       3      21       20   2   3
3   11     11      45      49      46      47      12     13      46       47   11  11

Upvotes: 0

moys

Reputation: 8033

Thanks to @Andy.L & @Henry Yik, I was able to figure our another way of achieving what I wanted to achieve. Putting it here just for sake of documentation.

a = df.filter(like='strt').idxmax(axis=1).str[0]
d = {"X":"A", "A":"X", "Y":"B", "B":"Y"}
df['Tstrt'] = df.filter(like='strt').max(axis=1)
df['Tend']=df.lookup(df.index,a+'end')
df['Ustrt']=df.lookup(df.index,a.map(d)+'strt')
df['Uend']=df.lookup(df.index,a.map(d)+'end')
df


Astrt   Aend    Bstrt   Bend    Xstrt   Xend    Ystrt   Yend    Tstrt   Tend    Ustrt   Uend
25       27     15       16     11       12     40        42       40      42   15      16
50       51     45       46     23       25     35        36       50      51   23      25
14       15     21       20     8         9     2          3       21      20   2       3
11       11     45       49     46       47     12        13       46      47   11      11

Upvotes: 1

Andy L.

Reputation: 25269

Use filter, lookup and replace to construct those columns:

df_strt = df.filter(regex='strt$')
df_end = df.filter(regex='end$')
s = df_strt.idxmax(1)
d = {"X":"A", "A":"X", "Y":"B", "B":"Y"}
df['Tstrt'] = df_strt.lookup(*zip(*s.items()))
df['Tend'] = df_end.lookup(*zip(*s.str.replace('strt', 'end').items()))
df['Ustrt'] = df_strt.lookup(*zip(*s.replace(d,regex=True).items()))
df['Uend'] = (df_end.lookup(*zip(*s.str.replace('strt', 'end')
                                       .replace(d,regex=True).items())))

Out[202]:
   Astrt  Aend  Bstrt  Bend  Xstrt  Xend  Ystrt  Yend  Tstrt  Tend  Ustrt  Uend
0     25    27     15    16     11    12     40    42     40    42     15    16
1     50    51     45    46     23    25     35    36     50    51     23    25
2     14    15     21    20      8     9      2     3     21    20      2     3
3     11    11     45    49     46    47     12    13     46    47     11    11

Upvotes: 2

Max &amp; Min for the rows based on conditions in Pandas (column name dependent)

Answers (3)

Related Questions

Max & Min for the rows based on conditions in Pandas (column name dependent)