Mike
Mike

Reputation: 155

Pandas: List of maximum values of difference from previous rows in new column

I want to add a new column 'BEST' to this dataframe, which contains a list of the names of the columns which meet these criteria:

Input:

   A  B  C
0  1  2  3
1  3  2  1
2  5  2  4
3 -3  0  2
4  6  3  4

Desired result:

   A  B  C        BEST
0  1  2  3          []
1  3  2  1          []
2  5  2  4       ['A']
3 -3  0  2       ['C']
4  6  3  4  ['A', 'B']

Explanation for the result in the new column 'BEST' for each row:

Code to generate the dataframe:

import io, pandas as pd

df = io.StringIO("""
A|B|C
1|2|3
3|2|1
5|2|4
-3|0|2
6|3|4
""")
df = pd.read_csv(df, sep='|')

Upvotes: 0

Views: 503

Answers (3)

Allen Qin
Allen Qin

Reputation: 19947

First use shift and subtract to get the diff, then replace the maximum values with the column name and drop the others.

df['BEST'] = (
    df.subtract(df.shift(2))
    .apply(lambda x: [df.columns[i] for i,e in enumerate(x) if e==max(x)], axis=1)
)

    A   B   C   BEST
0   1   2   3   []
1   3   2   1   []
2   5   2   4   [A]
3   -3  0   2   [C]
4   6   3   4   [A, B]

Upvotes: 3

Crystal L
Crystal L

Reputation: 571

You can use shift(), apply() and np.where to do so. Try this:

df['Best'] = (df - df.shift(2)).iloc[2:].apply(lambda x: list(df.columns[np.where(x==max(x))]), axis=1)

Output:

print(df)

A   B   C   Best
0   1   2   3   NaN
1   3   2   1   NaN
2   5   2   4   [A]
3   -3  0   2   [C]
4   6   3   4   [A, B]

Upvotes: 2

Anurag Dabas
Anurag Dabas

Reputation: 24314

Try:

maximum=df.sub(df.shift(2)).max(skipna=False,axis=1)
#finded the maximum value
df['Best']=(df.where(df.sub(df.shift(2)).isin(maximum))
          .T
          .apply(lambda x:x.dropna().index.tolist()))

Output of df:

    A   B   C   Best
0   1   2   3   []
1   3   2   1   []
2   5   2   4   [A]
3   -3  0   2   [C]
4   6   3   4   [A, B]

Upvotes: 3

Related Questions