Reputation: 866
I have a column of dicts in a pandas DataFrame. The input is:
import pandas as pd
df = pd.DataFrame([{'A': {'k1': 10}}, {'A': {'k2': 20, 'k3': 30}}, {'A': {'k4': 15}}])
df
A
0 {u'k1': 10}
1 {u'k3': 30, u'k2': 20}
2 {u'k4': 15}
I would like to split the keys and values of the dicts in column 'A' into two new columns, with one row per key (so the number of rows depends on the number of keys in each dict). The output should look like this:
keys values
0 k1 10
1 k3 30
2 k2 20
3 k4 15
Upvotes: 1
Views: 902
Reputation: 51175
Option 1 (if all keys across the sub-dictionaries are unique): build a single dict with collections.ChainMap
from collections import ChainMap
# Merge the dicts stored in column 'A' into one mapping
dct = dict(ChainMap(*df['A']))
pd.DataFrame(list(dct.items()), columns=['key', 'value'])
key value
0 k1 10
1 k4 15
2 k2 20
3 k3 30
Option 2 (if keys might be duplicated): itertools.chain.from_iterable
import itertools
# Collect every (key, value) pair from the dicts in column 'A'
dct = list(itertools.chain.from_iterable(i.items() for i in df['A']))
df = pd.DataFrame(dct, columns=['key', 'value'])
key value
0 k1 10
1 k2 20
2 k3 30
3 k4 15
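The difference between the two options only shows up when a key repeats. Here is a minimal sketch of that case; the sample dicts are made up for illustration and are not the data from the question. ChainMap returns the value from the first mapping that contains a key, so later duplicates are hidden, while chaining the items keeps every pair.

```python
from collections import ChainMap
import itertools

# Hypothetical sample data: 'k1' appears in two sub-dictionaries.
dicts = [{'k1': 10}, {'k1': 99, 'k2': 20}]

# ChainMap looks a key up in the first mapping that has it,
# so the second 'k1' (99) is hidden after merging.
merged = dict(ChainMap(*dicts))

# Chaining the items keeps every (key, value) pair, duplicates included.
pairs = list(itertools.chain.from_iterable(d.items() for d in dicts))

print(merged['k1'])  # 10
print(pairs)         # [('k1', 10), ('k1', 99), ('k2', 20)]
```

So if losing rows for repeated keys is not acceptable, Option 2 is probably the safer choice.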
Upvotes: 1
Reputation: 862591
Use a list comprehension to flatten the dicts into tuples, then pass them to the DataFrame constructor:
L = [(k1, v1) for k, v in df['A'].to_dict().items() for k1, v1 in v.items()]
df = pd.DataFrame(L, columns = ['keys','values'])
print (df)
keys values
0 k1 10
1 k2 20
2 k3 30
3 k4 15
Or create a DataFrame and use stack (the values become floats because keys missing from a row are filled with NaN):
df = (pd.DataFrame(df['A'].values.tolist())
        .stack().reset_index(level=0, drop=True)
        .reset_index())
df.columns = ['keys','values']
print (df)
keys values
0 k1 10.0
1 k2 20.0
2 k3 30.0
3 k4 15.0
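If the float values are unwanted, one possible workaround is to cast back to integers once the NaN entries are gone. This is only a sketch using the data from the question: the names wide and out are illustrative, and the explicit dropna() is an assumption added so the result does not depend on whether stack drops missing values in the installed pandas version.

```python
import pandas as pd

df = pd.DataFrame([{'A': {'k1': 10}}, {'A': {'k2': 20, 'k3': 30}}, {'A': {'k4': 15}}])

# One column per key; keys missing from a row become NaN,
# which forces the columns to float64.
wide = pd.DataFrame(df['A'].tolist())

out = (wide.stack()
           .dropna()                          # make sure no NaN rows remain
           .reset_index(level=0, drop=True)   # keep only the key level
           .astype(int)                       # safe now that NaN is gone
           .reset_index())
out.columns = ['keys', 'values']
print(out)
```

The cast only works because every remaining value is a whole number; with genuinely fractional values it would truncate them.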
Upvotes: 2