zesla

Reputation: 11793

How to map a column that contains multiple strings according to a dictionary in pandas

I have a dataframe in which one column contains comma-separated strings. I want to map the values in that column according to a dictionary.

For example:

import numpy as np
import pandas as pd

dfm = pd.DataFrame({'Idx': np.arange(4), 'Names': ['John,Mary', 'Mike', 'Mike,Joe,Mary', 'John']})
mask = {'John':'1', 'Mary':'2','Joe':'3','Mike':'4'}

Desired Output:

    Idx Names
0   0   1,2
1   1   4
2   2   4,3,2
3   3   1

What's the best way to achieve that? Thanks.

Upvotes: 1

Views: 518

Answers (2)

Haleemur Ali

Reputation: 28253

You can pass a function as the replacement to .str.replace, which fits this case. On pandas 2.0 and later, pass regex=True explicitly: the default there is regex=False, and a callable replacement is rejected without it.

dfm.Names.str.replace(r'\w+(?=,|$)', lambda m: mask.get(m.group(0)), regex=True)

Using this, you can build a new data frame:

pd.DataFrame({
    'Idx': dfm.Idx,
    'Names': dfm.Names.str.replace(r'\w+(?=,|$)', lambda m: mask.get(m.group(0)), regex=True)
})
# outputs:
   Idx  Names
0    0    1,2
1    1      4
2    2  4,3,2
3    3      1
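
For reference, here is a self-contained sketch of the same regex idea, written so it also runs on pandas 2.0 and later. The fallback that leaves names missing from mask unchanged is an addition made here, not something the question asked for; without it, mask.get returns None for an unknown name and the replacement fails.

```python
import numpy as np
import pandas as pd

dfm = pd.DataFrame({'Idx': np.arange(4),
                    'Names': ['John,Mary', 'Mike', 'Mike,Joe,Mary', 'John']})
mask = {'John': '1', 'Mary': '2', 'Joe': '3', 'Mike': '4'}

# Replace every word that is followed by a comma or the end of the string.
# Assumption: names not present in mask are kept as they are.
result = dfm.assign(
    Names=dfm['Names'].str.replace(
        r'\w+(?=,|$)',
        lambda m: mask.get(m.group(0), m.group(0)),
        regex=True,  # required for a callable replacement on pandas >= 2.0
    )
)
print(result)
```

Using assign returns a new frame and leaves dfm untouched, which matches the "create a new data frame" intent above.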

Upvotes: 0

abhilb

Reputation: 5757

You can try this:

>>> dfm.Names.apply(lambda x: ','.join([mask[i] for i in x.split(',')]))
0      1,2
1        4
2    4,3,2
3        1
Name: Names, dtype: object
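
If you would rather avoid a Python-level lambda per row, a possible alternative (a different technique from the apply above, sketched here under the assumption that every name is present in mask) is to split, explode, map, and join back by the original index:

```python
import numpy as np
import pandas as pd

dfm = pd.DataFrame({'Idx': np.arange(4),
                    'Names': ['John,Mary', 'Mike', 'Mike,Joe,Mary', 'John']})
mask = {'John': '1', 'Mary': '2', 'Joe': '3', 'Mike': '4'}

mapped = (
    dfm['Names']
    .str.split(',')     # list of names per row
    .explode()          # one name per row, original index repeated
    .map(mask)          # dictionary lookup; unknown names would become NaN
    .groupby(level=0)   # regroup by the original row index
    .agg(','.join)      # back to one comma-separated string per row
)
dfm['Names'] = mapped
print(dfm)
```

Note that a name missing from mask turns into NaN after map, and the join then raises a TypeError, so this sketch only suits data where the dictionary covers every name.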

Upvotes: 1
