Reputation: 61
I have a fairly large dataframe:
A | B | C | D | |
---|---|---|---|---|
0 | 17 | 36 | 45 | 54 |
1 | 18 | 23 | 17 | 17 |
2 | 74 | 47 | 8 | 46 |
3 | 48 | 38 | 96 | 83 |
I am trying to create a new column that is the (max value of the columns) - (2nd highest value) / (2nd highest value).
In this example it would look something like:
A | B | C | D | Diff | |
---|---|---|---|---|---|
0 | 17 | 36 | 45 | 54 | .20 |
1 | 18 | 23 | 17 | 17 | .28 |
2 | 74 | 47 | 8 | 46 | .57 |
3 | 48 | 38 | 96 | 83 | .16 |
I've tried df['diff'] = df.loc[:, 'A': 'D'].max(axis=1) - df.iloc[:df.index.get_loc(df.loc[:, 'A': 'D'].idxmax(axis=1))] / ...
but even that part of the formula returns an error, nevermind including the final division. I'm sure there must be an easier way going about this.
Edit: Additionally, I am also trying to get the difference between the max value and the column that immediately precedes the max value. I know this is a somewhat different question, but I would appreciate any insight. Thank you!
Upvotes: 6
Views: 312
Reputation: 323226
Let us try get the second Max value with mask
Max = df.max(1)
secMax = df.mask(df.eq(Max,0)).max(1)
df['Diff'] = (Max - secMax)/secMax
df
Out[69]:
A B C D Diff
0 17 36 45 54 0.200000
1 18 23 17 17 0.277778
2 74 47 8 46 0.574468
3 48 38 96 83 0.156627
Upvotes: 1
Reputation: 75080
We can also make use of numpy sort and np.diff :
arr = np.sort(df,axis=1)[:,-2:]
df['Diff'] = np.diff(arr,axis=1)[:,0]/arr[:,0]
print(df)
A B C D Diff
0 17 36 45 54 0.200000
1 18 23 17 17 0.277778
2 74 47 8 46 0.574468
3 48 38 96 83 0.156627
Upvotes: 2
Reputation: 29742
One way using pandas.Series.nlargest
with pct_change
:
df["Diff"] = df.apply(lambda x: x.nlargest(2).pct_change(-1)[0], axis=1)
Output:
A B C D Diff
0 17 36 45 54 0.200000
1 18 23 17 17 0.277778
2 74 47 8 46 0.574468
3 48 38 96 83 0.156627
Upvotes: 4
Reputation: 150735
One way is to apply a udf:
def get_pct(x):
xmax2, xmax = x.sort_values().tail(2)
return (xmax-xmax2)/xmax2
df['Diff'] = df.apply(get_pct, axis=1)
Output:
A B C D Diff
0 17 36 45 54 0.200000
1 18 23 17 17 0.277778
2 74 47 8 46 0.574468
3 48 38 96 83 0.156627
Upvotes: 3