agustin

Reputation: 1351

Walk in dict to find desired keys

I can recursively collect the value of every extras key in a nested structure:

def findkeys(node, kv):
    if isinstance(node, list):
        for i in node:
            for x in findkeys(i, kv):
                yield x
    elif isinstance(node, dict):
        if kv in node:
            yield node[kv]
        for j in node.values():
            for x in findkeys(j, kv):
                yield x 

With the following input:

a = {'product': {'extras': {'size': 'large', 'color': 'green', 'name':'shirt'}, 'cat': 'male', 'season': 'summer'}, 'id': 'a12b', 'brand': 'aua'}
print(list(findkeys(a, 'extras')))

the output is as desired:

[{'size': 'large', 'color': 'green', 'name': 'shirt'}]

How can I change my function to also capture cat and id? I only want the cat key that is a sibling of extras and the id key that sits one level up, next to the parent of extras. The ideal output would be:

[{'size': 'large', 'color': 'green', 'name': 'shirt', 'cat': 'male', 'id': 'a12b' }]

Also note that product may not be present in the dict. This is why I need to find extras first (it is always present) and then look at its siblings and parents.

As suggested in the comments, here is a complete dictionary covering the possible cases:

{  
   "contents":[  
      {  
         "product":{  
            "extras":{  
               "size":"large",
               "color":"green",
               "name":"shirt"
            },
            "cat":"male"
         },
         "id":"a12b"
      },
      {  
         "products":{  
            "extras":{  
               "size":"small",
               "color":"red",
               "name":"trouser"
            },
            "cat":"male",
            "price":12.21
         },
         "id":"a23b"
      },
      {  
         "produkt":{  
            "extras":{  
               "size":"medium",
               "color":"yellow",
               "name":"hat"
            },
            "cat":"female",
            "price":2.87,
            "units":100
         },
         "id":"a34b"
      }
   ]
}

Please note that I cannot just use ['product'] to navigate the objects, because the key is not always named product (other variations appear). That is how the data arrives from the source. My desired output:

[{'size': 'large', 'color': 'green', 'name': 'shirt', 'cat': 'male', 'id': 'a12b' },
{'size': 'small', 'color': 'red', 'name': 'trouser', 'cat': 'male', 'id': 'a23b' },
{'size': 'medium', 'color': 'yellow','name': 'hat', 'cat': 'female', 'id': 'a34b' }]

Upvotes: 0

Views: 223

Answers (2)

Henry Yik

Reputation: 22503

Not thoroughly tested; this is based on your current function, with a mutable default argument added to store the values:

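def findkeys(node, kv, data={}):
    # `data` is a deliberately shared mutable default: the recursive calls
    # omit it, so every level reads and writes the same dict.
    if isinstance(node, list):
        for i in node:
            for x in findkeys(i, kv):
                yield x
    elif isinstance(node, dict):
        node_id = node.get("id")
        if node_id:
            data["id"] = node_id  # remember the most recent id seen on the way down
        if kv in node:
            data["cat"] = node.get("cat")
            data.update(node[kv])
            yield dict(data)  # yield a copy so list(findkeys(...)) keeps the values
            data.clear()
        for j in node.values():
            for x in findkeys(j, kv):
                yield x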

# b is the complete dictionary from the question (the one with "contents")
for i in findkeys(b["contents"], 'extras'):
    print(i)

Result:

{'id': 'a12b', 'cat': 'male', 'size': 'large', 'color': 'green', 'name': 'shirt'}
{'id': 'a23b', 'cat': 'male', 'size': 'small', 'color': 'red', 'name': 'trouser'}
{'id': 'a34b', 'cat': 'female', 'size': 'medium', 'color': 'yellow', 'name': 'hat'}

Upvotes: 1

Alain T.

Reputation: 42133

'cat' is a sibling of the item you're looking for, but 'id' is an "uncle". Your recursive function would need to pass itself the "path" of sub-dictionaries it went through, so that you can backtrack to the parent and grandparent of the key you find.
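
A minimal sketch of that idea, assuming the layout from the question (cat sits next to extras, id sits one level above that). The function names find_with_ancestors and collect are made up for illustration, and this has only been checked against the sample data:

```python
def find_with_ancestors(node, key, path=()):
    """Yield (value, path) for every dict that contains `key`.

    `path` is the tuple of dicts walked through from the root down to,
    and including, the dict that holds `key`. Lists are traversed but
    not recorded in the path.
    """
    if isinstance(node, list):
        for item in node:
            yield from find_with_ancestors(item, key, path)
    elif isinstance(node, dict):
        here = path + (node,)  # new tuple per level, nothing shared between calls
        if key in node:
            yield node[key], here
        for child in node.values():
            yield from find_with_ancestors(child, key, here)


def collect(data, key="extras", sibling="cat", uncle="id"):
    """Merge each `key` dict with its sibling and "uncle" values."""
    results = []
    for value, path in find_with_ancestors(data, key):
        record = dict(value)   # copy, so the input is left untouched
        parent = path[-1]      # the dict that holds `key` and its siblings
        if sibling in parent:
            record[sibling] = parent[sibling]
        # the grandparent dict holds the "uncle" keys, if there is one
        if len(path) >= 2 and uncle in path[-2]:
            record[uncle] = path[-2][uncle]
        results.append(record)
    return results


a = {'product': {'extras': {'size': 'large', 'color': 'green', 'name': 'shirt'},
                 'cat': 'male', 'season': 'summer'},
     'id': 'a12b', 'brand': 'aua'}
print(collect(a))
```

Because the path is looked up by position rather than by name, it does not matter whether the middle key is called product, products or produkt. Calling collect on the full dictionary with the "contents" list should give one merged dict per entry.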

Upvotes: 0
