Reputation: 1
I want to create a new column in a dataframe based on whether a certain value comes after another. For example, my df looks like this (edit: the values can be any of A, N, E, V):
Name   Jan  Feb  Mar  April
John   N    N    E    E
Alex   E    E    E    E
Alice  E    E    E    N
I'd like to have a new column 'designation' that is 1 if a person's value changed from 'N' in one month to 'E' in the next, and 0 otherwise. The dataframe comes from an Excel file that is updated every month, so the latest month always appears in the last column.
This is the code I have so far:
    def func(u, x, y, z):
        # u, x, y, z are the values of four consecutive months
        if u == 'N' and x == 'E':
            return 1
        elif x == 'N' and y == 'E':
            return 1
        elif y == 'N' and z == 'E':
            return 1
        else:
            return 0
Essentially, I want code that doesn't have to be updated every month and that uses column positions rather than column names to compute the value.
Upvotes: 0
Views: 39
Reputation: 15738
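Compare the dataframe with a copy of itself shifted one column to the right. Since 'E' < 'N' as strings, the comparison is True exactly where an 'E' follows an 'N' (as long as those are the only two values), and casting to int gives the 0/1 values you want: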
>>> import pandas as pd
>>> df = pd.DataFrame({'Jan': list('NEE'), 'Feb': list('NEE'), 'Mar': 'E', 'April': list('EEN')})
>>> df
  Jan Feb Mar April
0   N   N   E     E
1   E   E   E     E
2   E   E   E     N
>>> (df < df.shift(1, axis=1)).astype(int)
   Jan  Feb  Mar  April
0    0    0    1      0
1    0    0    0      0
2    0    0    0      0
If you only want to check the latest month (the last column against the one before it), just:
(df.iloc[:, -1] < df.iloc[:, -2]).astype(int)
UPD: if the columns can contain other letters, compare against 'N' and 'E' explicitly:
((df.shift(1, axis=1) == 'N') & (df == 'E')).astype(int)
or, for a single column:
((df.iloc[:, -2] == 'N') & (df.iloc[:, -1] == 'E')).astype(int)
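To collapse the per-month mask into the single 'designation' column the question asks for, one option is to reduce it across columns with any(axis=1). The following is only a minimal sketch: it assumes 'Name' is kept as the index so that every column is a month, and the names months and n_then_e are illustrative, not from the original post.

```python
import pandas as pd

# Sample data from the question. Keeping 'Name' as the index is an
# assumption made here so that every column is a month column.
df = pd.DataFrame(
    {
        "Jan": list("NEE"),
        "Feb": list("NEE"),
        "Mar": list("EEE"),
        "April": list("EEN"),
    },
    index=pd.Index(["John", "Alex", "Alice"], name="Name"),
)

# Snapshot the month columns before adding the result column, so that
# 'designation' is never compared as if it were a month.
months = df.copy()

# True where a cell is 'E' and the cell to its left is 'N'
n_then_e = (months.shift(1, axis=1) == "N") & (months == "E")

# 1 if the row has at least one N -> E transition, else 0
df["designation"] = n_then_e.any(axis=1).astype(int)
```

Because the mask is built from column positions rather than column names, nothing needs to change when a new month is appended, provided the mask is computed on the month columns only. If 'Name' is an ordinary column instead of the index, it would have to be excluded first, for example with df.drop(columns='Name').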
Upvotes: 2