Reputation: 103
I have a dataframe(df) that looks like this:
a b c
1 35-23 RB {'1': '1', '13': '15', '14': '30', '20': '3'}
2 24-10 RB {'1': '1', '13': '17', '14': '86'}
3 26-30 RB {'1': '1', '13': '14', '14': '54'}
4 14-7 RB []
5 24-27 RB {'1': '1', '13': '20', '14': '94', '15': '1'}
6 38-52 RB {'1': '1', '70': '3', '72': '1', '84': '1'}
7 False RB {'1': '1', '70': '4', '71': '1', '72': '0.5'}
8 15-20 RB []
9 30-17 RB []
And I have another dictionary that gives the human readable names (approx. 100 key, value pairs) to the dictionary keys in column c. It looks something like this:
dict_names = {1: 'Test', 13: 'Ind Score', 14: 'Successful', 15: 'Cofactor', 20: 'Attempts', etc ...}
What I would like to do is map my dict_names values to the keys in my list of dicts in column c of my df, so I have human readable names. I would then convert the keys for the list of dicts (column c), the human readable names, to column names in my df.
I know I can convert the dicts in column c to different columns by doing something like this:
df['c'] = df['c'].apply(lambda x : dict(eval(x)))
df_stats = df['c'].apply(pd.Series)
Then I can concat the df_stats with the df and drop the old column c. Voila!
But I am stuck with mapping the dict_names values with the keys of the list of dicts in column c of my df.
The final df output I am looking for would be something like this:
a b Test Ind Score Successful Cofactor Attempts ....
1 35-23 RB 1 15 30 NaN 3
2 24-10 RB 1 17 86 NaN NaN
3 26-30 RB 1 17 86 NaN NaN
4 14-7 RB NaN NaN NaN NaN NaN
5 24-27 RB 1 20 94 1 NaN
6 38-52 RB 1 NaN NaN NaN NaN
7 False RB 1 NaN NaN NaN NaN
8 15-20 RB NaN NaN NaN NaN NaN
9 30-17 RB NaN NaN NaN NaN NaN
Help finding my way to my final output would be appreciated.
Upvotes: 0
Views: 152
Reputation: 323226
You can using rename
and pass a dict
to replace the columns; name
from ast import literal_eval
df['c'] = df['c'].apply(literal_eval)
#dd={'1': 'Test', '13': 'Ind Score', '14': 'Successful', '15': 'Cofactor', '20': 'Attempts'}
dd= {str(k):v for k,v in dict_names.items()}#change
pd.concat([df.drop('c',1),df['c'].apply(pd.Series).rename(columns=dd)],1)
Out[1320]:
a b Test Ind Score Successful Attempts
0 35-23 BR 1 15 30 3
1 24-10 BR 1 17 86 NaN
2 26-30 BR 1 14 54 NaN
Upvotes: 1