Reputation: 6615
I want to obtain all the paths to the keys in a JSON file. I often receive large JSON documents and I'm not sure where a given data element might be, or I need to query various elements of the data. Visualizing the JSON as a tree can be inconvenient.
Basically I want to get a list of all the different paths to make various future tasks easier.
For example:
myjson = {'transportation':'car',
          'address': {'driveway':'yes','home_address':{'state':'TX',
                                                       'city':'Houston'}},
          'work_address':{
              'state':'TX',
              'city':'Sugarland',
              'location':'office-tower',
              'salary':30000}}
It would be great if I could run some type of loop to get a list back in this format below or in a format....
myjson['address']['driveway']
myjson.address
myjson.address.driveway
myjson.address.home_address
myjson.address.home_address.city
myjson.address.home_address.state
myjson.transportation
myjson.work_address
myjson.work_address.city
myjson.work_address.location
myjson.work_address.salary
myjson.work_address.state
For example I've started with
mylist = []
for key, value in myjson.items():
    mylist.append(key)
    if type(value) is dict:
        for key2, value2 in myjson[key].items():
            mylist.append(key+'.'+key2)
print(mylist)
I guess this kinda works, but I don't know how to make this iterate indefinitely. For example, how would I build this up to being 3-10+ layers deep?
Upvotes: 0
Views: 1608
Reputation: 11
An implementation that also handles lists inside the JSON.
import json

def get_json_key_path(jsonStr, enable_index):
    json_keys = []
    jsonObj = json.loads(jsonStr)

    def get_key_path(jsonObj, parent=None):
        # Only dicts have keys; scalars inside lists are skipped
        if not isinstance(jsonObj, dict):
            return
        for key, value in jsonObj.items():
            if not isinstance(value, list) and '{}.{}'.format(parent, key) not in json_keys:
                json_keys.append('{}.{}'.format(parent, key))
            if isinstance(value, dict):
                get_key_path(value, parent='{}.{}'.format(parent, key))
            elif isinstance(value, list):
                i = 0
                for obj in value:
                    if enable_index:
                        # Include the element's position in the path
                        get_key_path(obj, parent='{}.{}.{}'.format(parent, key, i))
                    else:
                        get_key_path(obj, parent='{}.{}'.format(parent, key))
                    i = i + 1
            else:
                pass

    get_key_path(jsonObj, "")
    # Every path starts with '.', because the root parent is ""; strip it
    return [s[1:] for s in json_keys]
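To make the effect of the enable_index flag concrete, here is a minimal, self-contained sketch of the same idea written as a generator. It is not the code above: the names list_paths and unique_paths are made up for illustration, and unlike the function above it also reports the key that holds the list itself.

```python
def list_paths(obj, prefix="", enable_index=True):
    """Yield a dotted path for every dict key, descending into lists."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            yield path
            yield from list_paths(value, path, enable_index)
    elif isinstance(obj, list):
        for i, item in enumerate(obj):
            if enable_index:
                # Record the element's position as its own path segment
                item_prefix = f"{prefix}.{i}" if prefix else str(i)
            else:
                item_prefix = prefix
            yield from list_paths(item, item_prefix, enable_index)
    # Scalars (str, int, None, ...) have no keys, so nothing is yielded


def unique_paths(obj, enable_index=True):
    """Return the paths as a list, dropping duplicates but keeping order."""
    return list(dict.fromkeys(list_paths(obj, "", enable_index)))


doc = {"name": "x", "people": [{"name": "a", "tags": ["p"]}, {"name": "b"}]}
print(unique_paths(doc, enable_index=True))
print(unique_paths(doc, enable_index=False))
```

With indices enabled, each list element gets its own path (people.0.name, people.1.name); with indices disabled, the elements collapse into a single people.name entry, which is why the duplicates have to be removed.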
Upvotes: 0
Reputation: 639
Great snippet!
Here is a version that also handles lists:
def get_keys(some_dictionary, parent=None):
    # Appends to a module-level list named my_list, which must exist already.
    # List elements may be strings, numbers, etc.; only dicts have keys.
    if not isinstance(some_dictionary, dict):
        return
    for key, value in some_dictionary.items():
        if '{}.{}'.format(parent, key) not in my_list:
            my_list.append('{}.{}'.format(parent, key))
        if isinstance(value, dict):
            get_keys(value, parent='{}.{}'.format(parent, key))
        elif isinstance(value, list):
            for v in value:
                get_keys(v, parent='{}.{}'.format(parent, key))
        else:
            pass
Upvotes: 2
Reputation: 5774
I think this should do what you're asking:
myjson = {
    'transportation': 'car',
    'address': {
        'driveway': 'yes',
        'home_address': {
            'state': 'TX',
            'city': 'Houston'}
    },
    'work_address': {
        'state': 'TX',
        'city': 'Sugarland',
        'location': 'office-tower',
        'salary': 30000}
}

def get_keys(some_dictionary, parent=None):
    for key, value in some_dictionary.items():
        if '{}.{}'.format(parent, key) not in my_list:
            my_list.append('{}.{}'.format(parent, key))
        if isinstance(value, dict):
            get_keys(value, parent='{}.{}'.format(parent, key))
        else:
            pass

my_list = []
get_keys(myjson, parent='myjson')
print(my_list)
Outputs:
['myjson.transportation',
'myjson.work_address',
'myjson.work_address.city',
'myjson.work_address.state',
'myjson.work_address.location',
'myjson.work_address.salary',
'myjson.address',
'myjson.address.driveway',
'myjson.address.home_address',
'myjson.address.home_address.city',
'myjson.address.home_address.state']
The key is to just keep calling get_keys() recursively from within the function!
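If you would rather not rely on a global list or on recursion depth, the same walk can be sketched with an explicit stack. This is only an illustration under a few assumptions: the name key_paths is made up, only nested dicts are followed (no lists), and the ordering relies on dicts preserving insertion order (Python 3.7+).

```python
def key_paths(data, root="myjson"):
    """Return dotted paths for every key in nested dicts, without recursion."""
    paths = []
    # Push children in reverse so they are popped in insertion order
    stack = [(f"{root}.{k}", v) for k, v in reversed(list(data.items()))]
    while stack:
        path, value = stack.pop()
        paths.append(path)
        if isinstance(value, dict):
            # Children go on top of the stack, so a subtree is finished
            # before the next sibling is visited (depth-first, pre-order)
            stack.extend(
                (f"{path}.{k}", v) for k, v in reversed(list(value.items()))
            )
    return paths


myjson = {
    'transportation': 'car',
    'address': {
        'driveway': 'yes',
        'home_address': {'state': 'TX', 'city': 'Houston'},
    },
    'work_address': {
        'state': 'TX',
        'city': 'Sugarland',
        'location': 'office-tower',
        'salary': 30000,
    },
}

for p in key_paths(myjson):
    print(p)
```

Because the result is returned rather than appended to a shared list, the function can be called on several documents without resetting anything in between.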
Upvotes: 0