Reputation: 8247
I have a pandas DataFrame like this:
order_id buyer_id scheduled_order minutes flag
525 232 1 13 Null
862 232 1 14 Null
1361 232 1 15 Null
1373 232 1 13 Null
1580 232 1 14 Null
1729 232 0 11 Null
1817 232 1 18 Null
I want to set a flag depending on the value of scheduled_order.
If the first order is a scheduled order (scheduled_order = 1), flag should be set to 0; otherwise, flag should be 1 if minutes is greater than 12 and 2 if not.
Then, for each subsequent order: if the previous order is a scheduled order, flag should be set to 3. If the previous order is a live order (scheduled_order = 0), flag should be set to 2 if minutes is less than 12 and to 1 if minutes is greater than 12.
My desired output is:
order_id buyer_id scheduled_order minutes flag
525 232 1 13 0
862 232 1 14 3
1361 232 1 15 3
1373 232 1 13 3
1580 232 1 14 3
1729 232 0 11 3
1817 232 1 18 2
Here is my code in Python:
    for i in range(len(df)):
        if(df.scheduled_order[i] == '1'):
            df.flag[i] = '0'
        else:
            if(df.minutes[i] > 12):
                df.flag[i] = '1'
            else:
                df.flag[i] = '2'
But when i becomes 1, how do I check the previous scheduled_order value?
Upvotes: 1
Views: 4579
Reputation: 210832
Try this:
    from __future__ import print_function
    import pandas as pd

    # create DataFrame from the CSV file
    df = pd.read_csv('data.csv', delimiter=r'\s+')

    # set flag to 3, for all rows where previous 'scheduled_order' == 1
    # except first row
    df.loc[(df.index > 0) & (df['scheduled_order'].shift(1) == 1), 'flag'] = 3

    # set flag to 1, for all rows where previous 'scheduled_order' != 1
    # and minutes > 12
    # except first row
    df.loc[(df.index > 0) & (df['scheduled_order'].shift(1) != 1) & (df['minutes'] > 12), 'flag'] = 1

    # set flag to 2, for all rows where previous 'scheduled_order' != 1
    # and minutes <= 12, except first row
    df.loc[(df.index > 0) & (df['scheduled_order'].shift(1) != 1) & (df['minutes'] <= 12), 'flag'] = 2

    # set flag for the first row ...
    if df.loc[0, 'scheduled_order'] == 1:
        df.loc[0, 'flag'] = 0
    else:
        if df.loc[0, 'minutes'] > 12:
            df.loc[0, 'flag'] = 1
        else:
            df.loc[0, 'flag'] = 2

    print(df)
Output:
order_id buyer_id scheduled_order minutes flag
0 525 232 1 13 0
1 862 232 1 14 3
2 1361 232 1 15 3
3 1373 232 1 13 3
4 1580 232 1 14 3
5 1729 232 0 11 3
6 1817 232 1 18 1
PS: I've followed your algorithm as written, which is why I get flag == 1 for the last row. If that's not what you want, please clarify the algorithm.
If you want to compare with the "previous" minutes instead, replace df['minutes'] with df['minutes'].shift(1), so that the output will be exactly the same as yours.
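For comparison, the same rules can be written as one ordered list of conditions. This is only a sketch, not a replacement for the code above: it assumes the "previous minutes" reading (the one that matches the desired output in the question), treats exactly 12 minutes as "not greater than 12", and builds the frame inline rather than reading data.csv. The names prev_sched, prev_minutes and is_first are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (flag column left out on purpose)
df = pd.DataFrame({
    "order_id": [525, 862, 1361, 1373, 1580, 1729, 1817],
    "buyer_id": [232] * 7,
    "scheduled_order": [1, 1, 1, 1, 1, 0, 1],
    "minutes": [13, 14, 15, 13, 14, 11, 18],
})

# Values from the previous row; the first row has no predecessor, so it gets NaN
prev_sched = df["scheduled_order"].shift(1)
prev_minutes = df["minutes"].shift(1)
is_first = prev_sched.isna()

# np.select picks the choice of the first condition that is True for each row
conditions = [
    is_first & (df["scheduled_order"] == 1),  # first order is scheduled
    is_first & (df["minutes"] > 12),          # first order is live, minutes > 12
    is_first,                                 # first order is live, minutes <= 12
    prev_sched == 1,                          # previous order was scheduled
    prev_minutes > 12,                        # previous order was live, its minutes > 12
]
choices = [0, 1, 2, 3, 1]

# Anything left over is "previous order was live, minutes <= 12" (assumption: 12 itself maps to 2)
df["flag"] = np.select(conditions, choices, default=2)
print(df)
```

Replacing prev_minutes with df["minutes"] in the last condition gives the literal reading of the algorithm, where the last row gets flag 1 instead of 2.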
Upvotes: 1
Reputation: 11034
Add a column with the previous scheduled order value:
df['prev_scheduled_order'] = df.scheduled_order.shift(1)
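To see what that column contains, here is a small sketch on the sample data from the question (the frame is rebuilt inline and prev_was_scheduled is a made-up helper name). The first row has no predecessor, so shift(1) leaves NaN there and the column becomes float.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "order_id": [525, 862, 1361, 1373, 1580, 1729, 1817],
    "scheduled_order": [1, 1, 1, 1, 1, 0, 1],
    "minutes": [13, 14, 15, 13, 14, 11, 18],
})

# Each row now carries the scheduled_order value of the row above it
df["prev_scheduled_order"] = df["scheduled_order"].shift(1)

# Comparing against NaN yields False, so the first row is never treated
# as "previous order was scheduled"
prev_was_scheduled = df["prev_scheduled_order"] == 1

print(df)
print(prev_was_scheduled.tolist())
```

The boolean mask can then be used with df.loc to set flag for the matching rows.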
Upvotes: 0
Reputation: 22021
Access the previous scheduled_order as scheduled_order[i-1].
Note that you can only do this when i > 0; otherwise your code would be buggy, because index -1 would access the last element of the list.
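A minimal sketch of that loop, using plain Python lists with the values from the question. It assumes that the minutes to compare are those of the previous order (which matches the desired output) and that exactly 12 minutes maps to flag 2; speed_flag is a hypothetical helper.

```python
# Values copied from the question
scheduled_order = [1, 1, 1, 1, 1, 0, 1]
minutes = [13, 14, 15, 13, 14, 11, 18]


def speed_flag(m):
    """Return 1 for more than 12 minutes, else 2 (assumption: 12 itself gives 2)."""
    return 1 if m > 12 else 2


flags = []
for i in range(len(scheduled_order)):
    if i == 0:
        # First order: there is no i-1, so handle it separately.
        # Without this guard, scheduled_order[i - 1] would be scheduled_order[-1],
        # i.e. the LAST element of the list.
        flags.append(0 if scheduled_order[0] == 1 else speed_flag(minutes[0]))
    elif scheduled_order[i - 1] == 1:
        # Previous order was a scheduled order
        flags.append(3)
    else:
        # Previous order was a live order: look at its minutes
        flags.append(speed_flag(minutes[i - 1]))

print(flags)
```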
Upvotes: 0
Reputation: 134
You could assign scheduled_order to a temporary variable and compare against it on the next iteration.
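Roughly, that idea could look like the sketch below. The function name assign_flags and the variables prev_scheduled and prev_minutes are made up for illustration; it assumes the previous order's minutes are the ones to compare and that exactly 12 minutes maps to flag 2.

```python
def assign_flags(scheduled_orders, minutes):
    """Compute flags row by row, remembering the previous row in temp variables."""
    flags = []
    prev_scheduled = None  # None means "no previous order yet"
    prev_minutes = None
    for scheduled, mins in zip(scheduled_orders, minutes):
        if prev_scheduled is None:
            # First order: 0 if scheduled, otherwise decide by its own minutes
            flag = 0 if scheduled == 1 else (1 if mins > 12 else 2)
        elif prev_scheduled == 1:
            # Previous order was scheduled
            flag = 3
        else:
            # Previous order was live: decide by the previous order's minutes
            flag = 1 if prev_minutes > 12 else 2
        flags.append(flag)
        # Remember this row for the next iteration
        prev_scheduled, prev_minutes = scheduled, mins
    return flags


# Values copied from the question
flags = assign_flags([1, 1, 1, 1, 1, 0, 1], [13, 14, 15, 13, 14, 11, 18])
print(flags)
```

With a DataFrame, the result could be stored with df['flag'] = assign_flags(df['scheduled_order'], df['minutes']).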
Upvotes: 0