Reputation: 490
I have data that is grouped by origin and destination pairs. I would like to compare the first row in the group against the other rows in the group. If a row meets certain conditions, then I would like to flag it. Below is a snapshot of my data:
Orig Dest Route Vol Per VolPct
ORD ICN A 2,251 0.64 0.78
ORD ICN B 366 0.97 0.13
ORD ICN C 142 0.14 0.05
ORD ICN D 100 0.22 0.03
ORD ICN E 25 1.00 0.01
If a row has a "VolPct" of >=.1, then it gets compared against the 1st row. So in this example, Route B would be compared against Route A. If the difference between "Per" of Route B and Route A is >= .2, then I'd like to flag Route B by putting an "X" in a new column. In this example, Route B would get flagged. Expected output would look like this:
Orig Dest Route Vol Per VolPct Flag
ORD ICN A 2,251 0.64 0.78
ORD ICN B 366 0.97 0.13 x
ORD ICN C 142 0.14 0.05
ORD ICN D 100 0.22 0.03
ORD ICN E 25 1.00 0.01
Any suggestions how to go about this? I'm using Pandas and Python 3.6. Thanks for your help!
Upvotes: 4
Views: 573
Reputation: 402503
IIUC, you can do this with groupby
+ first
. Use np.where
for determining values for Flag
.
i = df.VolPct >= .1
j = (df.Per - df.groupby('Orig').Per.transform('first') >= .2)
df['Flag'] = np.where(i & j, 'X', '')
df
Orig Dest Route Vol Per VolPct Flag
0 ORD ICN A 2,251 0.64 0.78
1 ORD ICN B 366 0.97 0.13 X
2 ORD ICN C 142 0.14 0.05
3 ORD ICN D 100 0.22 0.03
4 ORD ICN E 25 1.00 0.01
Upvotes: 5