Reputation: 79
I have a pandas dataframe column that looks as follows:
col1
200
300
400
200
500
700
0
0
60
0
0
I'm trying to create a new column whose value depends on the current row and also on the next few rows. For example: if (row_col1(i) > 60) & ((row_col1(i+1) + row_col1(i+2) + row_col1(i+3)) > 100), write 'yes' in col2; otherwise write 'no'.
col1 col2
200 yes
300 yes
400 yes
200 yes
500 yes
700 yes
0 no
0 no
60 no
0 no
0 no
Any ideas on how to achieve this?
Upvotes: 0
Views: 417
Reputation: 881
You can handle this in a DataFrame using the .shift() method. Refer to this link for an example.
Here's one approach which uses zip and list:
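l1 = df['col1'].tolist()
l2 = []
for a, b, c, d in zip(l1, l1[1:], l1[2:], l1[3:]):
    # use `and` here: `&` binds tighter than `>`, so `a>60 & (b+c+d)>100` tests the wrong thing
    if a > 60 and (b + c + d) > 100:
        l2 += ['yes']
    else:
        l2 += ['no']
l2 += ['', '', '']  # cater for the last 3 entries. I've left it blank, you can decide whether to go for 'yes' or 'no' or 'NA'
df['col2'] = l2  # assigning the list directly avoids index-alignment surprises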
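For comparison, here is a rough sketch of the .shift() route mentioned at the top of this answer. It rebuilds the sample column from the question so it runs on its own. The name next3 is just an illustrative helper, and fill_value=0 is my assumption for rows that have fewer than three rows after them (it makes them come out as 'no' rather than blank), so adjust that if you want different behaviour at the end.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({"col1": [200, 300, 400, 200, 500, 700, 0, 0, 60, 0, 0]})

# shift(-k) pulls the value from k rows ahead into the current row.
# fill_value=0 (an assumption) treats rows past the end of the frame as 0.
next3 = sum(df["col1"].shift(-k, fill_value=0) for k in (1, 2, 3))

# Default to 'no', then mark rows that satisfy both conditions
df["col2"] = "no"
df.loc[(df["col1"] > 60) & (next3 > 100), "col2"] = "yes"

print(df)
```

Note that with the rule exactly as written, the 700 row gets 'no' on this data, because the three values after it (0, 0, 60) sum to 60, which is not greater than 100. If that row should be 'yes', the condition needs adjusting.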
Upvotes: 1