Mazz

Reputation: 820

Map dictionary values to key values in a dataframe column

I have a dataframe:

values
 NaN
 NaN
 [1,2,5]
 [2]
 [5]

And a dictionary:

{nan: nan,
'1': '10',
 '2': '11',
 '5': '12',}

The dataframe contains keys from the dictionary.

How can I replace these keys with the corresponding values from the same dictionary?

Output:

values
 NaN
 NaN
 [10,11,12]
 [11]
 [12]

I have tried:

so_df['values'].replace(my_dictionary, inplace=True)
so_df.head()

Upvotes: 1

Views: 137

Answers (3)

ansev

Reputation: 30920

Setup

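import numpy as np
import pandas as pd

# Sample data matching the question: NaN rows plus lists of integer keys
df = pd.DataFrame({'values': [np.nan, np.nan, [1, 2, 5], [2], [5]]})
my_dict = {np.nan: np.nan, '1': '10', '2': '11', '5': '12'}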

Use Series.explode with Series.map

df['values']=( df['values'].explode()
                       .astype(str)
                       .map(my_dict)
                       .dropna()
                       .astype(int)
                       .groupby(level = 0)
                       .agg(list) )
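
To make the chain above easier to follow, here is the same idea split into named steps. This is only a sketch on the question's sample data; the intermediate names (exploded, mapped, lists) are mine and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Same sample data as in the setup.
df = pd.DataFrame({'values': [np.nan, np.nan, [1, 2, 5], [2], [5]]})
my_dict = {np.nan: np.nan, '1': '10', '2': '11', '5': '12'}

# 1. One row per list element; the original index label is repeated.
exploded = df['values'].explode()

# 2. Cast to str so the elements match the string keys, then look them up.
#    Entries without a matching key end up as NaN.
mapped = exploded.astype(str).map(my_dict)

# 3. Drop the unmatched entries, convert to int, and rebuild one list per
#    original row by grouping on the repeated index labels.
lists = mapped.dropna().astype(int).groupby(level=0).agg(list)

# 4. Assignment aligns on the index, so rows missing from `lists`
#    (rows 0 and 1 here) become NaN.
df['values'] = lists
```

The grouping step only works because explode keeps the original index labels, which is why groupby(level=0) can put the elements back together.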

If there are other strings in your values column, you would need pd.to_numeric with errors='coerce'. To keep them, you should do:

df['values']=(pd.to_numeric( df['values'].explode()
                                         .astype(str)
                                         .replace(my_dict),
                             errors = 'coerce')
                .dropna()
                .groupby(level = 0)
                .agg(list) 
                .fillna(df['values'])
               )

Output

         values
0           NaN
1           NaN
2  [10, 11, 12]
3          [11]
4          [12]

UPDATE

Solution without explode (Series.explode was added in pandas 0.25):

df['values']=(pd.to_numeric( df['values'].apply(pd.Series)
                                         .stack()
                                         .reset_index(level=1, drop=True)
                                         .astype(str)
                                         .replace(my_dict),
                         errors = 'coerce')
                 .dropna()
                 .groupby(level = 0)
                 .agg(list) 
                 .fillna(df['values'])
         )

Upvotes: 1

Ajay Maity

Reputation: 760

You can use the apply() method of the pandas DataFrame. See the implementation below:

import pandas as pd
import numpy as np

df = pd.DataFrame([np.nan,
 np.nan,
 ['1', '2', '5'],
 ['2'],
 ['5']], columns=['values'])

my_dict = {np.nan: np.nan,
 '1': '10',
 '2': '11',
 '5': '12'}

def update(row):
    if isinstance(row['values'], list):
        row['values'] = [my_dict.get(val) for val in row['values']]
    else:
        row['values'] = my_dict.get(row['values'])
    return row

df = df.apply(update, axis=1)

This is a simple implementation. Just make sure that if your dataframe contains strings, your dictionary keys are strings as well.
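
If only one column needs translating, the same idea can probably be applied to that column directly instead of going row by row. The sketch below assumes string elements and string keys as above; map_cell is a hypothetical helper name, and unlike the code above it keeps an element unchanged when it has no key in the dictionary.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'values': [np.nan, np.nan, ['1', '2', '5'], ['2'], ['5']]})
my_dict = {'1': '10', '2': '11', '5': '12'}

def map_cell(cell):
    """Hypothetical helper: translate every element of a list cell."""
    if isinstance(cell, list):
        # Fall back to the original element when the key is missing.
        return [my_dict.get(val, val) for val in cell]
    # Non-list cells (here: NaN) are returned unchanged.
    return cell

# Series.apply calls map_cell once per cell of the column.
df['values'] = df['values'].apply(map_cell)
```

Returning non-list cells untouched means the NaN entry is not needed in the dictionary for this variant.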

Upvotes: 3

luigigi

Reputation: 4215

Try:

df['values']=pd.to_numeric(df['values'].explode().astype(str).map(my_dict), errors='coerce').groupby(level=0).agg(list)
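
A quick way to check this one-liner on the question's sample data, as a sketch. As far as I can tell, because nothing is dropped before grouping, the NaN rows come back as one-element lists holding NaN rather than plain NaN. Adding dropna() before the groupby (as in the first answer) avoids that. The names numeric, as_is and cleaned are mine.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'values': [np.nan, np.nan, [1, 2, 5], [2], [5]]})
my_dict = {np.nan: np.nan, '1': '10', '2': '11', '5': '12'}

# Explode, match the string keys, and convert the mapped strings to numbers.
numeric = pd.to_numeric(df['values'].explode().astype(str).map(my_dict),
                        errors='coerce')

# Without dropna(), every index label forms a group, so rows 0 and 1
# end up as lists containing a single NaN.
as_is = numeric.groupby(level=0).agg(list)

# With dropna(), the all-NaN rows are absent from the result and become
# plain NaN again once the result is aligned back to df's index.
cleaned = numeric.dropna().astype(int).groupby(level=0).agg(list)
df['values'] = cleaned
```

Comparing as_is with cleaned on your own data shows which form you actually get.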

Upvotes: 1
