tgordon18

Reputation: 1859

Pandas MultiIndex (more than 2 levels) DataFrame to Nested Dict/JSON

This question is similar to this one, but I want to take it a step further. Is it possible to extend the solution to work with more levels? The .to_dict() method of a MultiIndex DataFrame has some promising options, but most of them return entries keyed by tuples (e.g. ('A', 0, 0): 274.0) rather than nesting them in dictionaries.

For an example of what I'm looking to accomplish, consider this multiindex dataframe:

import pandas as pd

data = {0: {
        ('A', 0, 0): 274.0, 
        ('A', 0, 1): 19.0, 
        ('A', 1, 0): 67.0, 
        ('A', 1, 1): 12.0, 
        ('B', 0, 0): 83.0, 
        ('B', 0, 1): 45.0
    },
    1: {
        ('A', 0, 0): 254.0, 
        ('A', 0, 1): 11.0, 
        ('A', 1, 0): 58.0, 
        ('A', 1, 1): 11.0, 
        ('B', 0, 0): 76.0, 
        ('B', 0, 1): 56.0
    }   
}
df = pd.DataFrame(data).T
df.index = ['entry1', 'entry2']
df
# output:

         A                              B
         0              1               0
         0      1       0       1       0       1
entry1   274.0  19.0    67.0    12.0    83.0    45.0
entry2   254.0  11.0    58.0    11.0    76.0    56.0

You can imagine that we have many records here, not just two, and that the index names could be longer strings. How could you turn this into nested dictionaries (or directly to JSON) that look like this:

[
 {'entry1': {'A': {0: {0: 274.0, 1: 19.0}, 1: {0: 67.0, 1: 12.0}},
  'B': {0: {0: 83.0, 1: 45.0}}},
 'entry2': {'A': {0: {0: 254.0, 1: 11.0}, 1: {0: 58.0, 1: 11.0}},
  'B': {0: {0: 76.0, 1: 56.0}}}}
]

I'm thinking some amount of recursion could potentially be helpful, maybe something like this, but have so far been unsuccessful.

Upvotes: 12

Views: 7466

Answers (2)

highlander

Reputation: 9

I took the idea from the previous answer and slightly modified it.

1) Took the function nested_dict from Stack Overflow to create the dictionary:

from collections import defaultdict
def nested_dict(n, type):
    if n == 1:
        return defaultdict(type)
    else:
        return defaultdict(lambda: nested_dict(n-1, type))

2) Wrote the following function:

def df_to_nested_dict(df, type):

    # Get the number of index levels (the MultiIndex is expected on the
    # rows; pass df.T if it is on the columns, as in the question)
    lvl = len(df.index.names)

    # Create the target dictionary
    new_nested_dict = nested_dict(lvl, type)
    # Convert the dataframe to a dictionary keyed by index tuples
    temp_dict = df.to_dict(orient='index')
    for x, y in temp_dict.items():
        # A single-level index yields scalar keys; wrap them in a tuple
        keys = x if isinstance(x, tuple) else (x,)
        # Walk down to the innermost dictionary, then assign the row.
        # This replaces the original exec() on a '[%d]' string, which
        # only handled integer keys.
        target = new_nested_dict
        for item in keys[:-1]:
            target = target[item]
        target[keys[-1]] = y

    return new_nested_dict

It is the same idea, just implemented slightly differently.
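
To see what the nested_dict helper from step 1 does on its own, here is a small sketch. The to_plain_dict function and the scores variable are hypothetical names added for illustration; they are not part of the answer. The sample values are borrowed from the question.

```python
from collections import defaultdict


def nested_dict(n, type):
    # Same helper as in step 1: n levels of defaultdicts,
    # with the innermost level built by `type`
    if n == 1:
        return defaultdict(type)
    return defaultdict(lambda: nested_dict(n - 1, type))


def to_plain_dict(d):
    # Hypothetical helper: recursively convert defaultdicts to plain dicts
    if isinstance(d, dict):
        return {k: to_plain_dict(v) for k, v in d.items()}
    return d


scores = nested_dict(3, float)
scores['A'][0][0] = 274.0  # intermediate levels are created on first access
scores['A'][0][1] = 19.0
scores['B'][0][0] = 83.0

plain = to_plain_dict(scores)
print(plain)
# {'A': {0: {0: 274.0, 1: 19.0}}, 'B': {0: {0: 83.0}}}
```

One thing to keep in mind: merely reading a missing key from a defaultdict creates it, so converting to a plain dict before handing the result to other code avoids accidental empty entries.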

Upvotes: 0

Brad Solomon

Reputation: 40888

So, you really need to do 2 things here:

  • df.to_dict()
  • Convert this to a nested dictionary.

df.to_dict(orient='index') gives you a dictionary with the index as keys; it looks like this:

>>> df.to_dict(orient='index')
{'entry1': {('A', 0, 0): 274.0,
  ('A', 0, 1): 19.0,
  ('A', 1, 0): 67.0,
  ('A', 1, 1): 12.0,
  ('B', 0, 0): 83.0,
  ('B', 0, 1): 45.0},
 'entry2': {('A', 0, 0): 254.0,
  ('A', 0, 1): 11.0,
  ('A', 1, 0): 58.0,
  ('A', 1, 1): 11.0,
  ('B', 0, 0): 76.0,
  ('B', 0, 1): 56.0}}

Now you need to nest this. Here's a trick from Martijn Pieters to do that:

def nest(d: dict) -> dict:
    result = {}
    for key, value in d.items():
        target = result
        for k in key[:-1]:  # traverse all keys but the last
            target = target.setdefault(k, {})
        target[key[-1]] = value
    return result
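
As a quick check of how nest behaves, here is a minimal sketch that runs it on a small tuple-keyed dict without pandas. The flat variable is a hypothetical stand-in for one row of the to_dict output, using a few values from the question.

```python
def nest(d: dict) -> dict:
    result = {}
    for key, value in d.items():
        target = result
        for k in key[:-1]:  # traverse all keys but the last
            # setdefault returns the existing sub-dict or inserts a new one
            target = target.setdefault(k, {})
        target[key[-1]] = value
    return result


# Hypothetical sample: one row's worth of tuple-keyed values
flat = {('A', 0, 0): 274.0, ('A', 0, 1): 19.0, ('B', 0, 0): 83.0}

nested = nest(flat)
print(nested)
# {'A': {0: {0: 274.0, 1: 19.0}}, 'B': {0: {0: 83.0}}}
```

Because each tuple is walked element by element, the same function works for any number of levels, as long as every key is a tuple and no key is a prefix of another.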

Putting this all together:

def df_to_nested_dict(df: pd.DataFrame) -> dict:
    d = df.to_dict(orient='index')
    return {k: nest(v) for k, v in d.items()}

Output:

>>> df_to_nested_dict(df)
{'entry1': {'A': {0: {0: 274.0, 1: 19.0}, 1: {0: 67.0, 1: 12.0}},
  'B': {0: {0: 83.0, 1: 45.0}}},
 'entry2': {'A': {0: {0: 254.0, 1: 11.0}, 1: {0: 58.0, 1: 11.0}},
  'B': {0: {0: 76.0, 1: 56.0}}}}
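
Since the question also asks about going to JSON, here is a hedged sketch of that last step using only the standard library. The flat dict is an abbreviated, hand-written stand-in for what df.to_dict(orient='index') returns above, and the variable names are assumptions for illustration.

```python
import json


def nest(d: dict) -> dict:
    result = {}
    for key, value in d.items():
        target = result
        for k in key[:-1]:  # traverse all keys but the last
            target = target.setdefault(k, {})
        target[key[-1]] = value
    return result


# Assumption: abbreviated stand-in for df.to_dict(orient='index')
flat = {
    'entry1': {('A', 0, 0): 274.0, ('A', 0, 1): 19.0, ('B', 0, 0): 83.0},
    'entry2': {('A', 0, 0): 254.0, ('A', 0, 1): 11.0, ('B', 0, 0): 76.0},
}

# json.dumps rejects tuple keys, which is why nesting has to come first
try:
    json.dumps(flat)
    tuple_keys_ok = True
except TypeError:
    tuple_keys_ok = False

nested = {k: nest(v) for k, v in flat.items()}
as_json = json.dumps(nested, indent=2)

# JSON object keys are always strings, so the integer levels come back as '0', '1'
round_tripped = json.loads(as_json)
print(round_tripped['entry1']['A']['0']['1'])  # 19.0
```

The integer level labels are silently converted to strings during serialization, so code that reads the JSON back needs to index with '0' rather than 0.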

Upvotes: 17
