Reputation: 598
Suppose I have a pandas DataFrame:
RowNum | PageName | OfInterest |
---|---|---|
0 | home | False |
1 | photo | False |
2 | list | True |
3 | photo | False |
4 | photo | False |
5 | photo | False |
6 | home | False |
7 | photo | False |
The OfInterest value for all rows with PageName=photo should be set to True only if they follow a PageName=list row, either directly or through an unbroken run of photo rows.
In my desired output, rows 3, 4, and 5 will be changed, but not rows 1 and 7:
RowNum | PageName | OfInterest |
---|---|---|
0 | home | False |
1 | photo | False |
2 | list | True |
3 | photo | True |
4 | photo | True |
5 | photo | True |
6 | home | False |
7 | photo | False |
I attempted to do this using apply(), but it seems that I cannot access the most recently changed values.
def changeInterest(x):
    followsOfInterest = (x['PageName'] == 'photo') and (x['PrevOfInterest'])
    return followsOfInterest or x['OfInterest']

df['PrevOfInterest'] = df['OfInterest'].shift(-1)
df['PrevOfInterest'] = df[['PageName', 'OfInterest', 'PrevOfInterest']].apply(changeInterest, axis=1)
I know I can accomplish the same using a loop, but I would like to find a more elegant solution.
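For reference, the loop-based baseline mentioned here might look like the following minimal sketch. It assumes the sample table above has been rebuilt as df; the names flags and in_list_run are illustrative and not part of the original attempt.

```python
import pandas as pd

# Sample data rebuilt from the table in the question
df = pd.DataFrame({
    'RowNum': range(8),
    'PageName': ['home', 'photo', 'list', 'photo', 'photo', 'photo', 'home', 'photo'],
    'OfInterest': [False, False, True, False, False, False, False, False],
})

flags = []
in_list_run = False  # True while the most recent non-photo page was 'list'
for page, flag in zip(df['PageName'], df['OfInterest']):
    if page == 'list':
        in_list_run = True
    elif page != 'photo':
        in_list_run = False  # any other page ends the run
    # Keep existing True values; otherwise flag photo rows inside a run
    flags.append(bool(flag) or (page == 'photo' and in_list_run))

df['OfInterest'] = flags
print(df)
```

The loop carries state from one row to the next, which is exactly what a row-wise apply() cannot do.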
Upvotes: 1
Views: 67
Reputation: 75080
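You can use replace and ffill here: replace 'photo' with NaN, forward-fill so that each photo row inherits the nearest preceding non-photo page name, then check whether the filled value is 'list' and combine the result with the existing OfInterest column.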
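import numpy as np

# Mask 'photo' rows, forward-fill the last non-photo page name, and keep existing True values
s = df['PageName'].replace('photo', np.nan).ffill().eq('list') | df['OfInterest']
df['OfInterest'] = s
print(df)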
   RowNum PageName  OfInterest
0       0     home       False
1       1    photo       False
2       2     list        True
3       3    photo        True
4       4    photo        True
5       5    photo        True
6       6     home       False
7       7    photo       False
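As a self-contained check, the same idea can be sketched end to end with the sample data. This variant is an assumption on my part rather than the code above: it uses Series.where instead of replace to blank out the photo rows, and the name last_non_photo is illustrative.

```python
import pandas as pd

# Sample data rebuilt from the table in the question
df = pd.DataFrame({
    'RowNum': range(8),
    'PageName': ['home', 'photo', 'list', 'photo', 'photo', 'photo', 'home', 'photo'],
    'OfInterest': [False, False, True, False, False, False, False, False],
})

# Blank out 'photo' rows (they become NaN), then carry the last
# non-photo page name forward over them
last_non_photo = df['PageName'].where(df['PageName'].ne('photo')).ffill()

# A row is flagged when the carried-forward page is 'list',
# or when it was already flagged
df['OfInterest'] = last_non_photo.eq('list') | df['OfInterest']

print(df['OfInterest'].tolist())
```

Note that with either variant the list rows themselves also come out True. That matches the sample data, but if a list row could start as False and should stay that way, the comparison would need an extra `& df['PageName'].eq('photo')`.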
Upvotes: 3