Reputation: 1998
I have a column in my DataFrame (df) that looks like this:
Service
DoorDash, Grubhub / Seamless, UberEats, Postmates
DoorDash, UberEats, Caviar, Tock
DoorDash
None
Caviar, Tock
None
Tock
DoorDash, Grubhub / Seamless, UberEats, Postmates
Grubhub / Seamless, UberEats
Is there an efficient way to create a new column for each service, where the value is True if that service appears in the Service column for that row and False otherwise?
So if I have a list of service names such as:
DoorDash, Grubhub / Seamless, UberEats, Caviar, Postmates, JustEat, Deliveroo, Foodora, Grab, Talabat
I want to create a column for each name in the list above, with a value of True or False depending on whether that service appears in the Service column.
Expected Output:
Service | DoorDash | Grubhub / Seamless | UberEats | Caviar | Postmates | JustEat | Deliveroo | Foodora | Grab | Talabat | Tock
DoorDash, Grubhub / Seamless, UberEats, Postmates | True | True | True | False | True | False | False | False | False | False | False
DoorDash, UberEats, Caviar, Tock | True | False | True | True | False | False | False | False | False | False | True
DoorDash | True | False | False | False | False | False | False | False | False | False | False
None | False | False | False | False | False | False | False | False | False | False | False
Caviar, Tock | False | False | False | True | False | False | False | False | False | False | True
None | False | False | False | False | False | False | False | False | False | False | False
Tock | False | False | False | False | False | False | False | False | False | False | True
DoorDash, Grubhub / Seamless, UberEats, Postmates | True | True | True | False | True | False | False | False | False | False | False
Grubhub / Seamless, UberEats | False | True | True | False | False | False | False | False | False | False | False
Thank you for looking
Upvotes: 1
Views: 80
Reputation: 862661
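Use Series.str.get_dummies to split each value on ', ' into indicator columns, convert them to boolean, then use DataFrame.reindex with your list to add the services that never occur in the data (filled with False), and finally join the result to the original DataFrame. Note that reindex keeps only the names in the list, so a service that is not listed (Tock here) is dropped: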
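# Services that should each get a column, in this order
L = ['DoorDash', 'Grubhub / Seamless', 'UberEats', 'Caviar',
     'Postmates', 'JustEat', 'Deliveroo', 'Foodora', 'Grab', 'Talabat']

# Split on ', ', build 0/1 indicators, cast to bool, align columns to L
df1 = df.join(df['Service'].str.get_dummies(', ')
                           .astype(bool)
                           .reindex(L, axis=1, fill_value=False))
print(df1)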
Service DoorDash \
0 DoorDash, Grubhub / Seamless, UberEats, Postmates True
1 DoorDash, UberEats, Caviar, Tock True
2 DoorDash True
3 None False
4 Caviar, Tock False
5 None False
6 Tock False
7 DoorDash, Grubhub / Seamless, UberEats, Postmates True
8 Grubhub / Seamless, UberEats False
Grubhub / Seamless UberEats Caviar Postmates JustEat Deliveroo \
0 True True False True False False
1 False True True False False False
2 False False False False False False
3 False False False False False False
4 False False True False False False
5 False False False False False False
6 False False False False False False
7 True True False True False False
8 True True False False False False
Foodora Grab Talabat
0 False False False
1 False False False
2 False False False
3 False False False
4 False False False
5 False False False
6 False False False
7 False False False
8 False False False
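The expected output in the question also has a Tock column, which the reindex above drops because Tock is not in L. A minimal self-contained sketch of one way to keep such extra services follows. The sample data is rebuilt from the question, and it assumes the missing entries are real None/NaN values. The names dummies, extra, cols and df2 are only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question (assumption: missing rows are real None values).
df = pd.DataFrame({
    "Service": [
        "DoorDash, Grubhub / Seamless, UberEats, Postmates",
        "DoorDash, UberEats, Caviar, Tock",
        "DoorDash",
        None,
        "Caviar, Tock",
        None,
        "Tock",
        "DoorDash, Grubhub / Seamless, UberEats, Postmates",
        "Grubhub / Seamless, UberEats",
    ]
})

L = ["DoorDash", "Grubhub / Seamless", "UberEats", "Caviar",
     "Postmates", "JustEat", "Deliveroo", "Foodora", "Grab", "Talabat"]

# One boolean indicator column per distinct service found in the data.
# Rows with a missing Service end up False in every column.
dummies = df["Service"].str.get_dummies(", ").astype(bool)

# Services present in the data but absent from L (here: Tock).
# The literal string "None" is skipped in case missing values were stored as text.
extra = [c for c in dummies.columns if c not in L and c != "None"]

# Listed services first, then the extras; names not seen in the data become False.
cols = L + extra
df2 = df.join(dummies.reindex(columns=cols, fill_value=False))

print(df2.columns.tolist())
```

With this data the extra list contains only Tock, so df2 has the Service column, the ten listed services, and Tock last, matching the column order in the question's expected output.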
Upvotes: 1