Vivek Harikrishnan

Reputation: 866

Pandas column dict split to new column and rows

I have a pandas DataFrame column in which each cell holds a dict. The input is:

import pandas as pd
df = pd.DataFrame([{'A': {'k1': 10}}, {'A': {'k2': 20, 'k3': 30}}, {'A': {'k4': 15}}])

df
                        A
0             {u'k1': 10}
1  {u'k3': 30, u'k2': 20}
2             {u'k4': 15}

I would like to split the keys and values of the dicts in column 'A' into two new columns, with one row per key (so a dict with several keys produces several rows). The output should look like this:

  keys  values
0   k1      10
1   k3      30
2   k2      20
3   k4      15

Upvotes: 1

Views: 902

Answers (2)

user3483203

Reputation: 51175

Option 1 (If you have all unique keys in sub-dictionaries)
dict with collections.ChainMap

from collections import ChainMap

# df['A'] yields one dict per row; ChainMap merges them into a single mapping
dct = dict(ChainMap(*df['A']))
pd.DataFrame(list(dct.items()), columns=['key', 'value'])

  key  value
0  k1     10
1  k4     15
2  k2     20
3  k3     30

Option 2 (If you might have duplicate keys)
itertools.chain.from_iterable

import itertools

# Flatten every row's (key, value) pairs into one list, keeping duplicate keys
dct = list(itertools.chain.from_iterable(d.items() for d in df['A']))
df = pd.DataFrame(dct, columns=['key', 'value'])

  key  value
0  k1     10
1  k2     20
2  k3     30
3  k4     15
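
To make the difference between the two options concrete, here is a minimal sketch on a made-up frame (df_dup is hypothetical, not the asker's data) in which 'k1' appears in two rows. ChainMap resolves a repeated key to the value from the first dict that contains it, so Option 1 silently drops the second occurrence, while chain.from_iterable keeps every pair.

```python
from collections import ChainMap
import itertools

import pandas as pd

# Hypothetical input: the key 'k1' appears in both rows
df_dup = pd.DataFrame([{'A': {'k1': 10}}, {'A': {'k1': 99, 'k2': 20}}])

# Option 1: a repeated key keeps only the value from the first dict that has it
merged = dict(ChainMap(*df_dup['A']))
unique_rows = pd.DataFrame(list(merged.items()), columns=['key', 'value'])

# Option 2: every (key, value) pair survives, in row order
pairs = list(itertools.chain.from_iterable(d.items() for d in df_dup['A']))
all_rows = pd.DataFrame(pairs, columns=['key', 'value'])

print(len(unique_rows), len(all_rows))
```

If the row order of the result matters, Option 2 is also the safer choice, since it walks the rows from top to bottom.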

Upvotes: 1

jezrael

Reputation: 862591

Use a list comprehension to flatten the dicts into tuples, then pass the result to the DataFrame constructor:

L = [(k1, v1) for k, v in df['A'].to_dict().items() for k1, v1 in v.items()]

df = pd.DataFrame(L, columns = ['keys','values'])
print (df)
  keys  values
0   k1      10
1   k2      20
2   k3      30
3   k4      15

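Or create a DataFrame from the dicts and stack it (the values come back as floats because the intermediate frame is padded with NaN):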

df = (pd.DataFrame(df['A'].values.tolist())
       .stack().reset_index(level=0, drop=True)
       .reset_index())
df.columns = ['keys','values']
print (df)
  keys  values
0   k1    10.0
1   k2    20.0
2   k3    30.0
3   k4    15.0
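
If integer values are wanted from the stack approach, one possible variation is to drop the NaN padding explicitly and cast back. This is a sketch that assumes every value in the dicts is a whole number. The explicit dropna() is an addition, included because whether stack() discards NaN by default depends on the pandas version; out is just an illustrative name.

```python
import pandas as pd

df = pd.DataFrame([{'A': {'k1': 10}}, {'A': {'k2': 20, 'k3': 30}}, {'A': {'k4': 15}}])

out = (pd.DataFrame(df['A'].tolist())    # one column per key, NaN where a row lacks it
         .stack()
         .dropna()                       # remove the NaN padding explicitly
         .reset_index(level=0, drop=True)
         .reset_index())
out.columns = ['keys', 'values']

# Assumption: all values are whole numbers, so the cast loses nothing
out['values'] = out['values'].astype(int)
print(out)
```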

Upvotes: 2
