Reputation: 447
I have data in the following format:
pastLocation | currentLocation
delhi | bangalore
delhi | london,pune,delhi
mumbai | mumbai
pune | pune, noida
I have to create a new column named changeInLocation: if pastLocation is present in currentLocation, the value of the new column should be 0, else 1.
For example, in the second row, pastLocation (i.e. delhi) is present in the corresponding currentLocation, so the value of changeInLocation should be 0.
The output should be in the following format:
pastLocation | currentLocation | changeInLocation
delhi | bangalore | 1
delhi | london,pune,delhi | 0
mumbai | mumbai | 0
pune | pune, noida | 0
Upvotes: 4
Views: 97
Reputation: 12417
Similar to jezrael's solution (which is more complete anyway), but without casting:
df['changeInLocation'] = df.apply(lambda x: 0 if x['pastLocation'] in x['currentLocation'] else 1, axis=1)
Upvotes: 2
Reputation: 164843
Similar to jezrael's solution, but taking care to strip whitespace and use set for performance:
import pandas as pd

df = pd.DataFrame({'pastLocation': ['delhi', 'delhi', 'mumbai', 'pune'],
                   'currentLocation': ['bangalore', 'london,pune,delhi',
                                       'mumbai', 'pune, noida']})

# Split each currentLocation on commas and strip whitespace from every item
sets = [{i.strip() for i in row} for row in df['currentLocation'].str.split(',').values]

# 1 if pastLocation is not among the current locations, else 0
df['changeInLocation'] = [int(past not in current)
                          for past, current in zip(df['pastLocation'], sets)]

print(df)
     currentLocation  pastLocation  changeInLocation
0          bangalore         delhi                 1
1  london,pune,delhi         delhi                 0
2             mumbai        mumbai                 0
3        pune, noida          pune                 0
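To illustrate why splitting into a set can matter beyond speed: on plain strings, in is a substring test, so a location whose name is contained in a longer name would be treated as present. A minimal sketch, assuming a made-up row ('punewadi' is hypothetical and not in the question's data) and illustrative variable names:

```python
import pandas as pd

# Hypothetical data, not from the question: 'pune' is a substring of 'punewadi'
df = pd.DataFrame({'pastLocation': ['pune', 'delhi'],
                   'currentLocation': ['punewadi, noida', 'london, delhi']})

# Substring test on the raw comma-separated string
substring_flag = [int(past not in current)
                  for past, current in zip(df['pastLocation'], df['currentLocation'])]

# Exact match against the stripped, comma-separated items
set_flag = [int(past not in {item.strip() for item in current.split(',')})
            for past, current in zip(df['pastLocation'], df['currentLocation'])]

print(substring_flag)  # [0, 0]
print(set_flag)        # [1, 0]
```

The first row differs: the substring test reports no change, while the exact match reports a change. If location names never overlap like this, both approaches should give the same result.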
Upvotes: 2
Reputation: 863741
Use apply with in to check membership and then cast to int:
df['changeInLocation'] = df.apply(lambda x: x['pastLocation'] not in x['currentLocation'], axis=1).astype(int)
Another solution is to zip the columns and use a list comprehension:
df['changeInLocation'] = [int(a not in b) for a, b in zip(df['pastLocation'], df['currentLocation'])]
print(df)
   pastLocation    currentLocation  changeInLocation
0         delhi          bangalore                 1
1         delhi  london,pune,delhi                 0
2        mumbai             mumbai                 0
3          pune        pune, noida                 0
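For a self-contained check, here is a minimal sketch that rebuilds the question's data and compares the two approaches; the names via_apply and via_zip are only for illustration:

```python
import pandas as pd

# Data from the question
df = pd.DataFrame({'pastLocation': ['delhi', 'delhi', 'mumbai', 'pune'],
                   'currentLocation': ['bangalore', 'london,pune,delhi',
                                       'mumbai', 'pune, noida']})

# Row-wise apply: boolean "not in" result cast to int
via_apply = df.apply(lambda x: x['pastLocation'] not in x['currentLocation'],
                     axis=1).astype(int)

# List comprehension over the zipped columns
via_zip = [int(a not in b) for a, b in zip(df['pastLocation'], df['currentLocation'])]

print(via_apply.tolist())  # [1, 0, 0, 0]
print(via_zip)             # [1, 0, 0, 0]
```

Both rely on substring matching against the raw currentLocation string, so they agree with each other on this data.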
Upvotes: 4