runningbirds

Reputation: 6615

Obtaining all key path combinations in a json/dictionary in python

I want to be able to obtain all the paths to the keys in a JSON file. I often receive large JSON files and I'm not sure where a given data element might be, or I need to query several elements of the data. Visualizing the JSON as a tree can be inconvenient.

Basically I want to get a list of all the different paths to make various future tasks easier.

For example:

myjson = {'transportation': 'car',
          'address': {'driveway': 'yes',
                      'home_address': {'state': 'TX',
                                       'city': 'Houston'}},
          'work_address': {'state': 'TX',
                           'city': 'Sugarland',
                           'location': 'office-tower',
                           'salary': 30000}}

It would be great if I could run some type of loop to get a list back in one of the formats below, or in a similar format:

myjson['address']['driveway']

myjson.address
myjson.address.driveway
myjson.address.home_address
myjson.address.home_address.city
myjson.address.home_address.state
myjson.transportation
myjson.work_address
myjson.work_address.city
myjson.work_address.location
myjson.work_address.salary
myjson.work_address.state

For example, I've started with:

mylist = []

for key, value in myjson.items():
    mylist.append(key)
    if type(value) is dict:
        for key2, value2 in myjson[key].items():
            mylist.append(key+'.'+key2)
print(mylist)

I guess this kinda works, but I don't know how to make this iterate indefinitely. For example, how would I build this up to being 3-10+ layers deep?

Upvotes: 0

Views: 1608

Answers (3)

Kenneth Sebastian

Reputation: 11

An implementation that also handles paths through lists in the JSON.

import json
def get_json_key_path(jsonStr, enable_index):
    json_keys = []
    jsonObj = json.loads(jsonStr)
    
    def get_key_path(jsonObj, parent=None):
        if not isinstance(jsonObj, dict):
            return
        for key, value in jsonObj.items():
            if not isinstance(value, list) and '{}.{}'.format(parent, key) not in json_keys:
                json_keys.append('{}.{}'.format(parent, key))
            if isinstance(value, dict):
                get_key_path(value, parent='{}.{}'.format(parent, key))
            elif isinstance(value, list):
                for i, obj in enumerate(value):
                    if enable_index:
                        get_key_path(obj, parent='{}.{}.{}'.format(parent, key, i))
                    else:
                        get_key_path(obj, parent='{}.{}'.format(parent, key))
            else:
                pass

    get_key_path(jsonObj, "")
    return [ s[1:] for s in json_keys]

Upvotes: 0

Fabien Vauchelles

Reputation: 639

Great snippet!

Here is a version that also handles lists:

# Appends to a module-level my_list, as in the snippet this builds on.
def get_keys(some_dictionary, parent=None):
    # List items may be strings, numbers, etc.; only dicts have keys to walk.
    if not isinstance(some_dictionary, dict):
        return
    for key, value in some_dictionary.items():
        if '{}.{}'.format(parent, key) not in my_list:
            my_list.append('{}.{}'.format(parent, key))
        if isinstance(value, dict):
            get_keys(value, parent='{}.{}'.format(parent, key))
        elif isinstance(value, list):
            for v in value:
                get_keys(v, parent='{}.{}'.format(parent, key))

Upvotes: 2

Reedinationer

Reputation: 5774

I think this should do what you're asking:

myjson = {
    'transportation': 'car',
    'address': {
        'driveway': 'yes',
        'home_address': {
            'state': 'TX',
            'city': 'Houston'}
    },
    'work_address': {
        'state': 'TX',
        'city': 'Sugarland',
        'location': 'office-tower',
        'salary': 30000}
}


def get_keys(some_dictionary, parent=None):
    for key, value in some_dictionary.items():
        if '{}.{}'.format(parent, key) not in my_list:
            my_list.append('{}.{}'.format(parent, key))
        if isinstance(value, dict):
            get_keys(value, parent='{}.{}'.format(parent, key))
        else:
            pass


my_list = []
get_keys(myjson, parent='myjson')
print(my_list)

Outputs:

['myjson.transportation',
'myjson.work_address',
'myjson.work_address.city',
'myjson.work_address.state',
'myjson.work_address.location',
'myjson.work_address.salary',
'myjson.address',
'myjson.address.driveway',
'myjson.address.home_address',
'myjson.address.home_address.city',
'myjson.address.home_address.state']

The key is to just keep calling get_keys() recursively from within the function!
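
The same recursive idea can also be written without a shared list. The following is only a sketch under a few assumptions: the names iter_key_paths, as_dotted and as_brackets are made up for illustration, the 'jobs' entry is invented sample data to show a list, and list positions are included in the path as integer indexes. Each path is yielded as a tuple of keys, so it can be rendered either in dotted form or in the myjson['address']['driveway'] form mentioned in the question.

```python
def iter_key_paths(node, prefix=()):
    """Yield every key path in nested dicts/lists as a tuple of keys."""
    if isinstance(node, dict):
        for key, value in node.items():
            path = prefix + (key,)
            yield path
            # Recurse into the value; scalars simply yield nothing.
            yield from iter_key_paths(value, path)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            # The index becomes part of deeper paths but is not yielded alone.
            yield from iter_key_paths(item, prefix + (index,))


def as_dotted(path, root='myjson'):
    # ('address', 'driveway') -> 'myjson.address.driveway'
    return '.'.join([root] + [str(part) for part in path])


def as_brackets(path, root='myjson'):
    # ('address', 'driveway') -> "myjson['address']['driveway']"
    return root + ''.join('[{!r}]'.format(part) for part in path)


# Hypothetical sample data; 'jobs' is added only to exercise the list branch.
data = {
    'transportation': 'car',
    'address': {'driveway': 'yes',
                'home_address': {'state': 'TX', 'city': 'Houston'}},
    'jobs': [{'title': 'clerk'}],
}

paths = list(iter_key_paths(data))
print([as_dotted(p) for p in paths])
print([as_brackets(p) for p in paths])
```

Yielding tuples rather than pre-formatted strings keeps the traversal separate from the output format, and the caller gets a fresh result on every call instead of appending to a global.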

Upvotes: 0
