gplt

Reputation: 65

Create dataframe from nested dictionaries which contain lists

I couldn't find a similar answer to my problem, so here we go:

I have a dict of the following form:

d = {
    key_1: {
        metric_1: [value_11, value_12],
        metric_2: [value_13, value_14],
        metric_3: value_15
    },
    key_2: {
        metric_1: [value_21],
        metric_2: [value_22],
        metric_3: value_23
    }
}

As you can see, the lists do not contain the same number of items for every key, and metric_3 holds a single value rather than a list.

What would be a good way to convert this into a DataFrame? If I use the from_dict method, I end up with cells that contain lists, which is not what I want.
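To make the problem concrete, here is a minimal sketch of what from_dict produces. The string values are hypothetical placeholders for the real data, and orient="index" is an assumption about how from_dict was called:

```python
import pandas as pd

# Hypothetical sample data with the same shape as the dict above.
d = {
    "key_1": {"metric_1": ["v11", "v12"], "metric_2": ["v13", "v14"], "metric_3": "v15"},
    "key_2": {"metric_1": ["v21"], "metric_2": ["v22"], "metric_3": "v23"},
}

# One row per key; the list-valued metrics are stored as whole lists in single cells.
wide = pd.DataFrame.from_dict(d, orient="index")
print(wide)
```

Each key gets exactly one row here, so the lists still have to be unpacked into separate rows.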

What I am trying to achieve is to have a new row for each one of the values in the lists, keeping the key as index:

index | metric_1 | metric_2 | metric_3
———————————————————————————————————————
key_1 | value_11 | value_13 | value_15
key_1 | value_12 | value_14 | value_15
key_2 | value_21 | value_22 | value_23

Ideas? :)

Upvotes: 2

Views: 52

Answers (2)

Umar.H

Reputation: 23099

If your dict has an arbitrary number of keys and values, this is one way:

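import pandas as pd

# Build one small DataFrame per key, then stack them with the key as the outer index level.
data_dict = {k: pd.DataFrame(v) for k, v in d.items()}

df = pd.concat(data_dict.values(), keys=data_dict.keys())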

print(df)


         metric_1  metric_2  metric_3
key_1 0  value_11  value_13  value_15
      1  value_12  value_14  value_15
key_2 0  value_21  value_22  value_23
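
The result above has a two-level index (key plus row position). If only the key should remain as the index, as in the question, the inner level can be dropped. A self-contained sketch, assuming hypothetical string values in place of the real data:

```python
import pandas as pd

# Hypothetical sample data with the same shape as the dict in the question.
d = {
    "key_1": {"metric_1": ["v11", "v12"], "metric_2": ["v13", "v14"], "metric_3": "v15"},
    "key_2": {"metric_1": ["v21"], "metric_2": ["v22"], "metric_3": "v23"},
}

# One frame per key; the scalar metric_3 is broadcast to the length of the lists.
frames = {key: pd.DataFrame(metrics) for key, metrics in d.items()}

# Passing a dict to concat uses its keys as the outer index level.
df = pd.concat(frames)

# Drop the inner positional level so that only the key is left as the index.
flat = df.droplevel(1)
print(flat)
```

This relies on every list under one key having the same length, and on each key having at least one list-valued metric; otherwise pd.DataFrame raises a ValueError for that key.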

Upvotes: 2

BENY

Reputation: 323226

Here is one way

s = pd.DataFrame(d).T
s = s.explode('metric_1').assign(metric_2=s.metric_2.explode().values)
print(s)

       metric_1  metric_2  metric_3
key_1  value_11  value_13  value_15
key_1  value_12  value_14  value_15
key_2  value_21  value_22  value_23

# Optionally move the keys from the index into a regular column:
# s.reset_index(inplace=True)
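
A variation on the same idea, as a sketch rather than a drop-in replacement: assuming pandas 1.3 or later, explode accepts a list of columns, so both list columns can be unpacked in one call. The string values below are hypothetical placeholders:

```python
import pandas as pd

# Hypothetical sample data with the same shape as the dict in the question.
d = {
    "key_1": {"metric_1": ["v11", "v12"], "metric_2": ["v13", "v14"], "metric_3": "v15"},
    "key_2": {"metric_1": ["v21"], "metric_2": ["v22"], "metric_3": "v23"},
}

# Keys become the index, metrics become the columns (same result as pd.DataFrame(d).T).
wide = pd.DataFrame.from_dict(d, orient="index")

# Assumption: pandas >= 1.3. The lists in metric_1 and metric_2 must have
# the same length within each row, otherwise explode raises a ValueError.
exploded = wide.explode(["metric_1", "metric_2"])
print(exploded)
```

The scalar metric_3 is repeated for every new row, and the key stays as the index.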

Upvotes: 3
