finlay

Reputation: 85

How to build path from keys of nested dictionary?

I'm writing a script that broadcasts a number of data streams over an MQTT network. I'm trying to convert the keys of the nested dicts to a string that I can then use as the MQTT broadcast channel. The data is coming in every second already formatted into a nested dict like so:

my_dict = { 'stream1': { 'dataset1': { 'value1': 123.4}},
                         'dataset2': { 'value1': 123.4,
                                       'value2': 567.8},
            'stream2': { 'dataset3': { 'value4': 910.2}},
            'stream3': {               'value5': 'abcd'}}

I've indented it for readability; the extra spaces aren't in the actual data. As you can see, it has multiple levels, not all levels have the same number of values, and some value keys are repeated. Also, one branch is shallower than the rest, but I can easily make it the same depth as the others if that makes the problem easier to solve.

The dict above should provide an output like this:

("stream1/dataset1/value1", "stream1/dataset2/value1", ..., "stream3/value5")

and so on.

I feel like recursion might be a good solution to this but I'm not sure how to maintain an ordered list of keys as I pass through the structure, as well as make sure I hit each item in the structure, generating a new path for each base-level item (note the absence of "stream1/dataset1").

Here's the code I have so far:

my_dict = { as defined above }

def get_keys(input_dict, path_list, current_path):
    for key, value in input_dict.items():
        if isinstance(value, dict):
            current_path += value
            get_keys(value, path_list, current_path)
        else:
            path = '/'.join(current_path)
            path_list.append(path)

my_paths = []
cur_path = []
get_keys(my_dict, my_paths, cur_path)
[print(p) for p in my_paths]

Upvotes: 5

Views: 2168

Answers (2)

a_guest

Reputation: 36249

You can use a generator for that purpose:

def convert(d):
    for k, v in d.items():
        if isinstance(v, dict):
            yield from (f'{k}/{x}' for x in convert(v))
        else:
            yield k

Considering your expected output you seem to have a misplaced curly brace } in your example data, but using this test data:

my_dict = { 'stream1': { 'dataset1': { 'value1': 123.4},
                         'dataset2': { 'value1': 123.4,
                                       'value2': 567.8}},
            'stream2': { 'dataset3': { 'value4': 910.2}},
            'stream3': {               'value5': 'abcd'}}

This is the output:

print(list(convert(my_dict)))
# ['stream1/dataset1/value1', 'stream1/dataset2/value1', 'stream1/dataset2/value2', 'stream2/dataset3/value4', 'stream3/value5']
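Since the goal in the question is to publish each value on its own channel, it may help to yield the leaf value along with its path. The following is only a sketch of that idea: the name iter_paths, the sep parameter, and the sample dict are made up for illustration and are not part of the answer above.

```python
def iter_paths(d, sep='/'):
    """Yield (path, value) pairs for every leaf of a nested dict."""
    for k, v in d.items():
        if isinstance(v, dict):
            # Prefix each path coming out of the sub-dict with the current key.
            for sub_path, leaf in iter_paths(v, sep):
                yield f'{k}{sep}{sub_path}', leaf
        else:
            # A plain value: the key itself is the last path component.
            yield str(k), v


# Hypothetical sample data, a trimmed-down version of the test dict.
sample = {'stream1': {'dataset1': {'value1': 123.4}},
          'stream3': {'value5': 'abcd'}}

for topic, payload in iter_paths(sample):
    print(topic, payload)
# stream1/dataset1/value1 123.4
# stream3/value5 abcd
```

Each (topic, payload) pair could then be handed to whatever publish call your MQTT client provides.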

Upvotes: 2

John Kugelman

Reputation: 361595

This is a great opportunity to use yield to turn your function into a generator. A generator can yield a whole bunch of items and behave much like a list or other iterable. The caller loops over its return value and gets one yielded item each iteration until the function returns.

def get_keys(input_dict):
    for key, value in input_dict.items():
        if isinstance(value, dict):
            for subkey in get_keys(value):
                yield key + '/' + subkey
        else:
            yield key

for key in get_keys(my_dict):
    print(key)

Inside the outer for loop each value is either a dict or a plain value. If it's a plain value, just yield the key. If it's a dict, recurse into it and prepend key + '/' to each sub-key path that the recursive call yields.
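To see the one-item-at-a-time behaviour, you can pull values from the generator by hand. This is a small illustration only; the dict small and the variable names are invented for the demo, and get_keys is repeated so the snippet runs on its own.

```python
def get_keys(input_dict):
    for key, value in input_dict.items():
        if isinstance(value, dict):
            for subkey in get_keys(value):
                yield key + '/' + subkey
        else:
            yield key


# Hypothetical tiny input, just to trace the recursion.
small = {'a': {'b': 1, 'c': 2}, 'd': 3}

gen = get_keys(small)   # nothing has run yet; this only creates the generator
first = next(gen)       # runs until the first yield
rest = list(gen)        # drains whatever is left
print(first)            # a/b
print(rest)             # ['a/c', 'd']
```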

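The nice thing is that you don't have to maintain any state: path_list and current_path are gone. get_keys() simply yields the strings one by one, and the recursion together with the yield statements flattens the keys naturally. Run against my_dict exactly as posted in the question, where the early closing brace leaves dataset2 as a top-level key, it prints: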

stream1/dataset1/value1
dataset2/value1
dataset2/value2
stream2/dataset3/value4
stream3/value5
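For comparison, the accumulator style from the question can also be made to work. The sketch below is one possible fix, not the only one: it extends the path with the key (the original added the value) and builds a fresh tuple at each level so sibling branches never share state. The names collect_paths and posted are invented here, and posted mirrors the question's data as written.

```python
def collect_paths(input_dict, path_list, current_path=()):
    for key, value in input_dict.items():
        # New tuple per key: the caller's current_path is never mutated.
        path = current_path + (key,)
        if isinstance(value, dict):
            collect_paths(value, path_list, path)
        else:
            path_list.append('/'.join(path))


# The question's dict as posted (dataset2 sits at the top level).
posted = {'stream1': {'dataset1': {'value1': 123.4}},
          'dataset2': {'value1': 123.4, 'value2': 567.8},
          'stream2': {'dataset3': {'value4': 910.2}},
          'stream3': {'value5': 'abcd'}}

paths = []
collect_paths(posted, paths)
for p in paths:
    print(p)
```

This works, but the caller has to create and pass the result list, which is exactly the bookkeeping the generator version avoids.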

Upvotes: 4
