Reputation: 85
I'm writing a script that broadcasts a number of data streams over an MQTT network. I'm trying to convert the keys of the nested dicts to a string that I can then use as the MQTT broadcast channel. The data is coming in every second already formatted into a nested dict like so:
my_dict = { 'stream1': { 'dataset1': { 'value1': 123.4}},
'dataset2': { 'value1': 123.4,
'value2': 567.8},
'stream2': { 'dataset3': { 'value4': 910.2}},
'stream3': { 'value5': 'abcd'}}
I've indented it to add readability, the extra spaces aren't in the actual data. As you can see it has multiple levels, not all levels have the same number of values, and some value keys are repeated. Also, one level is shallower than the rest but I can easily make it the same depth as the rest if that makes the problem easier to solve.
The dict above should provide an output like this:
("stream1/dataset1/value1", "stream1/dataset2/value1", ..., "stream3/value5")
and so on.
I feel like recursion might be a good solution to this but I'm not sure how to maintain an ordered list of keys as I pass through the structure, as well as make sure I hit each item in the structure, generating a new path for each base-level item (note the absence of "stream1/dataset1").
Here's the code I have so far:
my_dict = { as defined above }
def get_keys(input_dict, path_list, current_path):
for key, value in input_dict.items():
if isinstance(value, dict):
current_path += value
get_keys(value, path_list, current_path)
else:
path = '/'.join(current_path)
path_list.append(path)
my_paths = []
cur_path = []
get_keys(my_dict, my_paths, cur_path)
[print(p) for p in my_paths]
Upvotes: 5
Views: 2168
Reputation: 36249
You can use a generator for that purpose:
def convert(d):
for k, v in d.items():
if isinstance(v, dict):
yield from (f'{k}/{x}' for x in convert(v))
else:
yield k
Considering your expected output you seem to have a misplaced curly brace }
in your example data, but using this test data:
my_dict = { 'stream1': { 'dataset1': { 'value1': 123.4},
'dataset2': { 'value1': 123.4,
'value2': 567.8}},
'stream2': { 'dataset3': { 'value4': 910.2}},
'stream3': { 'value5': 'abcd'}}
This is the output:
print(list(convert(d)))
# ['stream1/dataset1/value1', 'stream1/dataset2/value1', 'stream1/dataset2/value2', 'stream2/dataset3/value4', 'stream3/value5']
Upvotes: 2
Reputation: 361595
This is a great opportunity to use yield
to turn your function into a generator. A generator can yield a whole bunch of items and behave much like a list or other iterable. The caller loops over its return value and gets one yielded item each iteration until the function returns.
def get_keys(input_dict):
for key, value in input_dict.items():
if isinstance(value, dict):
for subkey in get_keys(value):
yield key + '/' + subkey
else:
yield key
for key in get_keys(my_dict):
print(key)
Inside the outer for
loop each value is either a dict or a plain value. If it's a plain value, just yield the key. If it's a dict, iterate over it and prepend key + '/'
to each sub-key.
The nice thing is that you don't have to maintain any state. path_list
and current_path
are gone. get_keys()
simply yields the strings one by one and the yield
statements and recursive loop make the flattening of keys naturally shake out.
stream1/dataset1/value1
dataset2/value1
dataset2/value2
stream2/dataset3/value4
stream3/value5
Upvotes: 4