Reputation: 1241
I have a dataframe as
> print(df)
[Out:]
activity-code activity
-------------------------
0 unknown
99 NaN
84 sports
72;99 NaN
57 recreational
57;99;11 NaN
11 NaN
and a dictionary with activity-codes as keys,
> print(act_dict)
[Out:]
{10: 'unknown',
11: 'cultural',
57: 'recreational',
72: 'social service',
84: 'sports',
99: 'education'}
All the values inside the dataframe are stored as strings even the activity-code has values as string. Whereas the dictionary keys are of integer type I want to somehow map and replace with missing values in activity using the dictionary with reference to the values stored in activity-code column. So the desired output dataframe should be something like this,
> print(df)
[Out:]
activity-code activity
-------------------------
0 unknown
99 education
84 sports
72;99 social service;education
57 recreational
57;99;11 recreational;education;cultural
11 cultural
This is what I've tried so far,
df['new-activity'] = df['activity-code'].str.split(';').apply(lambda x: ';'.join([act_dict[int(i)] for i in x]))
but I'm getting KeyError for single values where the activity-codes aren't single code values. The error says KeyError: 0
How do i map the dictionary values to the missing values in activity column of dataframe?
Upvotes: 1
Views: 994
Reputation: 2472
Well in case there is no values against 0 in your dictionary you can use filter():
df['activity']= df['activity-code'].apply(lambda x:'; '.join(list(filter(None,map(act_dict.get,list(map(int,x.split(';'))))))))
Upvotes: 0
Reputation: 71610
Use apply
and str.split
, than in apply
, use a list comprehension and join it by ';'
:
df['activity'] = df['activity-code'].str.split(';').apply(lambda x: ';'.join([act_dict[int(i)] for i in x]))
And now:
print(df)
Output:
activity-code activity
0 0 unknown
1 99 education
2 84 sports
3 72;99 social service;education
4 57 recreational
5 57;99;11 recreational;education;cultural
6 11 cultural
Upvotes: 2