Flattening Nested JSON Dict with Variable Amount of Dictionaries and List Elements

Question

I have an example JSON response, already parsed into a Python dictionary:

d = {'httpStatus': 200,
     'httpStatusMessage': 'Success',
     'timestamp': '2022-02-16T19:06:00.1924563Z',
     'response': {
         'header': {'companyId': '1000',
                    'companyName': 'Bobs Groceries'},
         'body': {'dataSources': [
             {'dataSource': 3,
              'employees': [
                  {'employeeId': '25',
                   'employeeReference': '1500',
                   'activeFlag': True,
                   'createdDate': '2022-01-27T15:20:38.2700000Z',
                   'lastUpdate': '2022-01-27T15:20:38.3500000Z',
                   'firstName': 'Bob',
                   'lastName': 'Brantley'},
                  {'employeeId': '28',
                   'employeeReference': '1505',
                   'activeFlag': True,
                   'createdDate': '2022-01-27T15:20:24.2400000Z',
                   'lastUpdate': '2022-01-27T15:20:24.2400000Z',
                   'firstName': 'Jeffrey',
                   'lastName': 'Johnson'}]}]}}}

I would like to recursively parse this JSON and build a single flat dictionary in which every key found inside a list of dictionaries maps to a list holding the value from each dictionary in that list. In other words, I want to convert lists of dictionaries into dictionaries of lists.
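To make the target concrete, here is a minimal sketch of that conversion for a single, non-nested list. The helper name `dicts_to_lists` and the shortened sample data are only illustrative, and the sketch assumes every dictionary in the list has the same keys (otherwise the resulting lists would no longer line up by position).

```python
def dicts_to_lists(rows):
    """Merge a list of dicts into one dict mapping each key to a list of values."""
    merged = {}
    for row in rows:
        for key, value in row.items():
            # Create the list on first sight of a key, then append in order.
            merged.setdefault(key, []).append(value)
    return merged


# Shortened, illustrative version of the 'employees' list.
employees = [{'employeeId': '25', 'firstName': 'Bob'},
             {'employeeId': '28', 'firstName': 'Jeffrey'}]

print(dicts_to_lists(employees))
# {'employeeId': ['25', '28'], 'firstName': ['Bob', 'Jeffrey']}
```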

For the above example, I would like the output to look something like this:

output = {'httpStatus': 200,
          'httpStatusMessage': 'Success',
          'timestamp': '2022-02-16T19:06:00.1924563Z',
          'response_header_companyId': '1000',
          'companyName': 'Bobs Groceries',
          'dataSource': 3,
          'employeeId': ['25','28'],
          'employeeReference': ['1500','1505'],
          'activeFlag': [True, True],
          'createdDate': ['2022-01-27T15:20:38.2700000Z','2022-01-27T15:20:24.2400000Z'],
          'lastUpdate': ['2022-01-27T15:20:38.3500000Z','2022-01-27T15:20:24.2400000Z'],
          'firstName': ['Bob','Jeffrey'],
          'lastName': ['Brantley','Johnson']}

I can get somewhat close using this function:

def flatten_nested_json(d: dict) -> dict:
    """
    Accepts Dictionary argument which can have nested dictionaries/lists within.
    Output will be a flat dictionary that can be converted to a pandas dataframe
    """
    out = {}

    def flatten(x, name: str = ''):
        # handles dictionaries with elements
        if type(x) is dict:
            for k in x:
                flatten(x[k], name + k + "_")
        # handles lists with elements
        elif type(x) is list:
            for j in x:
                if type(j) is dict:
                    for y, z in j.items():
                        out[y] = z
                else:
                    out[j] = x
        else:
            out[name[:-1]] = x

    flatten(d)
    return out

However, the list of dictionaries with employeeId, etc. is not flattened: it is left as a list of dictionaries under the employees key. I need to add recursion for that part so that those dictionaries are exploded into new keys whose values are lists. Basically, I want to combine all the dictionaries in a list that share the same keys into a single dictionary, where each key maps to a list of the values taken from each dictionary.

A dynamic approach without pandas would be preferred. Thanks!
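One possible way to get there, sketched under a few stated assumptions rather than offered as a definitive solution: walk the structure recursively, let items of a list reuse their parent's key path, and gather every scalar under its key. The function name `flatten_to_lists` and the `full_path` and `sep` parameters are hypothetical. The sketch assumes that a key seen only once should come back as a bare value (so `dataSource` stays `3`), which also means a list holding a single employee would yield scalars instead of one-element lists. With leaf names as keys, identical leaf names from different branches are merged into the same list; passing `full_path=True` avoids that by joining the whole path.

```python
def flatten_to_lists(data, full_path=False, sep='_'):
    """Flatten nested dicts/lists; values that share a key are gathered into a list."""
    collected = {}

    def walk(node, path):
        if isinstance(node, dict):
            for key, value in node.items():
                walk(value, path + [key])
        elif isinstance(node, list):
            # List items reuse the parent's path, so sibling dicts share keys.
            for item in node:
                walk(item, path)
        else:
            if full_path:
                name = sep.join(path)
            else:
                name = path[-1] if path else ''
            collected.setdefault(name, []).append(node)

    walk(data, [])
    # Assumption: a key that was seen exactly once is returned as a bare value.
    return {k: v[0] if len(v) == 1 else v for k, v in collected.items()}


# Shortened, illustrative version of the response from the question.
sample = {'httpStatus': 200,
          'response': {'header': {'companyId': '1000'},
                       'body': {'dataSources': [
                           {'dataSource': 3,
                            'employees': [
                                {'employeeId': '25', 'firstName': 'Bob'},
                                {'employeeId': '28', 'firstName': 'Jeffrey'}]}]}}}

print(flatten_to_lists(sample))
# {'httpStatus': 200, 'companyId': '1000', 'dataSource': 3,
#  'employeeId': ['25', '28'], 'firstName': ['Bob', 'Jeffrey']}
```

Calling `flatten_to_lists(sample, full_path=True)` produces the same values under keys such as `response_header_companyId` and `response_body_dataSources_employees_employeeId`. Empty lists and empty dictionaries contribute no keys in this sketch.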

Syntax	Meaning
`$`	The root object
`jsonpath1 .. jsonpath2`	All nodes matched by jsonpath2 that descend from any node matching jsonpath1
`*`	any field


Answers (1)

Code:

Explanation:

Output:

[EDIT]

Code (A full path as a key):

Output (A full path as a key):
