Reputation: 616
I have a dataframe:
Name Segment Axis 1 2 3 4 5
0 Amazon 1 slope NaN 100 120 127 140
1 Amazon 1 x 0.0 1.0 2.0 3.0 4.0
2 Amazon 1 y 0.0 0.4 0.8 1.2 1.6
3 Amazon 2 slope NaN 50 57 58 59
4 Amazon 2 x 0.0 2.0 4.0 6.0 8.0
5 Amazon 2 y 0.0 1.0 2.0 3.0 4.0
df2:
Name Segment Optimal Cost
Amazon 1 115
Amazon 2 60
Netflix 1 100
Netflix 2 110
I am trying to compare the slope values in the axis column to the corresponding optimal cost values and extract the slope, x and y values.
The rule is: Find the last first slope value greater than its corresponding optimal cost
If there is no value greater than optimal cost, then report where slope is zero.
If there are only values greater than optimal cost, then report highest y value
Expected output:
Name Segment slope x y
0 Amazon 1 120 2 0.8
1 Amazon 2 NaN 0 0
With help of @wwnde . Current code is
s=df.set_index(['Name' , 'Segment','Axis']).stack().unstack('Axis')
s=s.dropna(subset=["slope"]).sort_values("slope").reset_index(level=2, drop=True)
df3=pd.merge(s, df2, on=['Name', 'Segment'], how='left')
df3[df3['slope']>df3['Optimal_Cost']].groupby(['Name',
'Segment']).first().reset_index()
But this only returns values if it finds the last first slope value greater than its corresponding optimal cost.
Upvotes: 0
Views: 67
Reputation: 24322
Let's continue with @wwnde solution with some changes in it:
s=df.set_index(['Name','Segment','Axis']).stack().unstack(2)
s=s.sort_values("slope").reset_index(level=2, drop=True)
#In above code we don't have to drop nan
out=pd.merge(s, df2, on=['Name', 'Segment'], how='left')
cond=out['slope'].gt(out['Optimal Cost']) | out['slope'].isna()
#make changes in condition to include nan's
out=out[cond].groupby(['Name','Segment'],as_index=False).first().drop('Optimal Cost',1)
Output of out
:
Name Segment slope x y
0 Amazon 1 120.0 2.0 0.8
1 Amazon 2 NaN 0.0 0.0
Upvotes: 3