Reputation: 11793
I have a dataframe in which one column contains strings separated by comma. I want map the columns according to a dictionary.
For example:
dfm = pd.DataFrame({'Idx': np.arange(4), 'Names': ['John,Mary', 'Mike', 'Mike,Joe,Mary', 'John']})
mask = {'John':'1', 'Mary':'2','Joe':'3','Mike':'4'}
Desired Output:
Idx Names
0 0 1,2
1 1 4
2 2 4,3,2
3 3 1
What's the best way to achieve that? Thanks.
Upvotes: 1
Views: 518
Reputation: 28253
It is possible to pass a function to the .str.replace
function that we can use in this case
dfm.Names.str.replace('\w+(?=,|$)', lambda m: mask.get(m.group(0)))
Using this, it is possible to create a new data frame as such:
pd.DataFrame({
'Idx': dfm.Idx,
'Names': dfm.Names.str.replace('\w+(?=,|$)', lambda m: mask.get(m.group(0)))
})
# outputs:
Idx Names
0 0 1,2
1 1 4
2 2 4,3,2
3 3 1
Upvotes: 0
Reputation: 5757
You can try this:
>>> dfm.Names.apply(lambda x: ','.join([mask[i] for i in x.split(',')]))
0 1,2
1 4
2 4,3,2
3 1
Name: Names, dtype: object
Upvotes: 1