Liam Hanninen

Reputation: 1573

Is there a way to fully traverse a Python dictionary that has unpredictable length but predictable structure?

Consider the following dictionary:

{
    "Key1":"value1",
    "Key2": []
}

Accessing the value for Key1 is trivial: dict_name['Key1']. Now consider it with one more level:

{
    "Key1":"valueA",
    "Key2": [
        {
            "Key1":"valueB",
            "Key2":[]
        },
        {
            "Key1":"valueC",
            "Key2":[]
        }
    ]
}

My goal is to get a list of all Key1 values. So for this dictionary I can do this:

values_list = [dictionary_name['Key1']]
additional_values = [child['Key1'] for child in dictionary_name['Key2']]
values_list.extend(additional_values)
print(values_list)

Out: ['valueA', 'valueB', 'valueC']

Now suppose you don't know how many descendants any Key2 might have, but you do know that every node is formatted the same way:

{
    "Key1":"value1",
    "Key2": [some or none child nodes]
}

So my question is: Is there a way to build a list of all possible Key1 values?

My current, messy attempt only gets me to the second level.

values_list = []
for first_level in first_levels:
    values_list.append(first_level['Key1'])
    next_levels = first_level.get('Key2', [])
    next_levels_len = len(next_levels)
    while next_levels_len > 0:
        next_levels_len = 0
        for next_level in next_levels:
            values_list.append(next_level['Key1'])
            next_levels = next_level.get('Key2', [])
            next_levels_len += len(next_levels)

Upvotes: 1

Views: 74

Answers (2)

Davinder Singh

Reputation: 2162

BrenBarn has given a nice, logical answer, but for a simple approach you can use a regular expression to extract this pattern. Before using it, look at the string representation of a dict object. For example:

>>> d = {"key1":"valueA"}
>>> str(d)
"{'key1': 'valueA'}"

A few points to note here:
1. Double quotes (") are replaced by single quotes (').
2. There is a space after the colon.
These are Python's formatting conventions, which we often ignore when writing code ourselves.

dictionary_name = {
    "Key1":"valueA",
    "Key2": [
        {
            "Key1":"valueB",
            "Key2":[]
        },
        {
            "Key1":"valueC",
            "Key2":[]
        }
    ]
}

import re

str_ = str(dictionary_name)
# Match every 'Key1': 'value' pair in the string form of the dict.
regobt = re.compile(r"'Key1': '\w*'")
list_ = regobt.findall(str_)
print(list_)

OUTPUT:

["'Key1': 'valueA'", "'Key1': 'valueB'", "'Key1': 'valueC'"]

Extracting your data from the output:

You can extract what you need from this output in various ways, such as another regular expression or plain string indexing.

Alternatively, append .replace("'", '').replace(':','') to str_ and change the regex object to re.compile(r"Key1 \w*").

The output is now:

['Key1 valueA', 'Key1 valueB', 'Key1 valueC']
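
If you only want the values themselves, one option is to put a capture group in the pattern so that findall returns just the captured part. This is a minimal sketch rather than a robust solution: it assumes every Key1 value is a plain string made of word characters (letters, digits, underscore), so values containing spaces, punctuation or quotes would be missed or truncated.

```python
import re

dictionary_name = {
    "Key1": "valueA",
    "Key2": [
        {"Key1": "valueB", "Key2": []},
        {"Key1": "valueC", "Key2": []},
    ],
}

# With exactly one capture group, findall returns the captured text
# (the value) instead of the whole match.
pattern = re.compile(r"'Key1': '(\w*)'")
values = pattern.findall(str(dictionary_name))
print(values)  # ['valueA', 'valueB', 'valueC']
```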

Upvotes: 1

BrenBarn

Reputation: 251373

Here's a simple version:

def get_key1(d):
    vals = [d['Key1']]
    for subd in d['Key2']:
        vals += get_key1(subd)
    return vals

Then use it by doing get_key1(my_dict).
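
As a quick check, here is the same function run on a three-level input. The dictionary below is made up for illustration (valueD is not in the question); it just adds one more level of nesting than the question's example.

```python
def get_key1(d):
    # This node's value, followed by the values of all of its descendants.
    vals = [d['Key1']]
    for subd in d['Key2']:
        vals += get_key1(subd)
    return vals


# Hypothetical input with a grandchild node.
my_dict = {
    "Key1": "valueA",
    "Key2": [
        {"Key1": "valueB", "Key2": [{"Key1": "valueD", "Key2": []}]},
        {"Key1": "valueC", "Key2": []},
    ],
}

print(get_key1(my_dict))  # ['valueA', 'valueB', 'valueD', 'valueC']
```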

The idea is that you want to put your logic into a function, and have that function call itself for each nested dictionary, then add the returned values to your list.
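
If the nesting could get very deep (CPython's default recursion limit is 1000 frames), the same traversal can be written with an explicit stack instead of recursion. This is only a sketch of that alternative: the name get_key1_iterative and the sample tree are made up for illustration, and it uses .get('Key2', []) so a node without Key2 is treated as having no children.

```python
def get_key1_iterative(root):
    """Collect every 'Key1' value depth-first without recursion."""
    values = []
    stack = [root]  # nodes still waiting to be visited
    while stack:
        node = stack.pop()
        values.append(node["Key1"])
        # Push children in reverse so the first child is popped first,
        # which keeps the same order as the recursive version.
        stack.extend(reversed(node.get("Key2", [])))
    return values


# Hypothetical input with a grandchild node.
tree = {
    "Key1": "valueA",
    "Key2": [
        {"Key1": "valueB", "Key2": [{"Key1": "valueD", "Key2": []}]},
        {"Key1": "valueC", "Key2": []},
    ],
}

print(get_key1_iterative(tree))  # ['valueA', 'valueB', 'valueD', 'valueC']
```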

Upvotes: 2
