Reputation: 8247
I have the following dataframe in pandas:
job_desig            salary
senior analyst       12
junior researcher    5
scientist            20
sr analyst           12
Now I want to add a flag column that is 1 when job_desig contains any of the keywords below and 0 otherwise:
sr = ['senior','sr']
Desired output:
job_desig            salary    senior_profile
senior analyst       12        1
junior researcher    5         0
scientist            20        0
sr analyst           12        1
I am currently trying the following in pandas:
df['senior_profile'] = [1 if x.str.contains(sr) else 0 for x in df['job_desig']]
Upvotes: 3
Views: 319
Reputation: 862511
Join all values of the list with | (regex OR), pass the resulting pattern to Series.str.contains, and finally cast to integer to map True/False to 1/0:
df['senior_profile'] = df['job_desig'].str.contains('|'.join(sr)).astype(int)
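For reference, here is a minimal self-contained sketch of the same idea, rebuilding the dataframe from the question. The list comprehension in the question iterates over plain Python strings, which have no .str accessor, so the vectorised Series.str.contains is used instead. The re.escape call is an extra precaution not in the original answer, in case a keyword ever contains regex metacharacters.

```python
import re

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "job_desig": ["senior analyst", "junior researcher", "scientist", "sr analyst"],
    "salary": [12, 5, 20, 12],
})
sr = ["senior", "sr"]

# Build one regex of alternatives; escaping is an added safeguard (assumption).
pattern = "|".join(re.escape(word) for word in sr)

# str.contains returns booleans; astype(int) maps True/False to 1/0.
df["senior_profile"] = df["job_desig"].str.contains(pattern).astype(int)
print(df)
```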
If necessary, use word boundaries:
pat = '|'.join(r"\b{}\b".format(x) for x in sr)
df['senior_profile'] = df['job_desig'].str.contains(pat).astype(int)
print(df)
           job_desig  salary  senior_profile
0     senior analyst      12               1
1  junior researcher       5               0
2          scientist      20               0
3         sr analyst      12               1
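To illustrate when word boundaries matter, here is a small sketch comparing the two patterns. The title "disruption manager" is a made-up example (not from the question) that happens to contain the letters "sr" inside a longer word.

```python
import pandas as pd

# "disruption manager" is a hypothetical title used only for this comparison.
titles = pd.Series(["senior analyst", "sr analyst", "disruption manager"])
sr = ["senior", "sr"]

plain = "|".join(sr)                                    # matches anywhere in the string
bounded = "|".join(r"\b{}\b".format(w) for w in sr)     # matches whole words only

plain_flags = titles.str.contains(plain).astype(int).tolist()
bounded_flags = titles.str.contains(bounded).astype(int).tolist()

print(plain_flags)    # [1, 1, 1] because "disruption" contains "sr"
print(bounded_flags)  # [1, 1, 0]
```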
Solution with sets, if the list contains only single-word values:
df['senior_profile'] = [int(bool(set(sr).intersection(x.split()))) for x in df['job_desig']]
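A variation on the set approach, as a sketch: lowercasing each title before splitting makes the check case-insensitive, which the original one-liner is not. The sample titles here are illustrative assumptions. Note that splitting on whitespace only matches whole tokens, so something like "sr." with a trailing period would not match.

```python
import pandas as pd

# Illustrative titles (assumed), including mixed case.
df = pd.DataFrame({
    "job_desig": ["Senior Analyst", "sr analyst", "disruption manager", "scientist"],
})
sr = {"senior", "sr"}

# A title is flagged when at least one of its lowercase words is in the keyword set.
df["senior_profile"] = [
    int(not sr.isdisjoint(title.lower().split())) for title in df["job_desig"]
]
print(df)
```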
Upvotes: 5
Reputation: 405
You can do this with str.contains alone, combining the two checks with | (note that the result is a boolean column, not 1/0):
df['senior_profile'] = df['job_desig'].str.contains('senior') | df['job_desig'].str.contains('sr')
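If the 1/0 flag from the question is needed, the combined boolean mask can be cast afterwards. A minimal sketch using the question's data:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "job_desig": ["senior analyst", "junior researcher", "scientist", "sr analyst"],
    "salary": [12, 5, 20, 12],
})

# Element-wise OR of two boolean masks.
is_senior = df["job_desig"].str.contains("senior") | df["job_desig"].str.contains("sr")

# Cast True/False to 1/0 to match the desired output.
df["senior_profile"] = is_senior.astype(int)
print(df)
```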
Upvotes: 3