Reputation: 153
Consider the following YAML configuration file:
items:
  item_A:
    field_A: some_value
    field_B: some_other_value
  item_B:
    field_A: some_value
    field_B: some_other_value
The logical way to represent this is to add a dash in front of each item, making it a list of items:
items:
  - item_A:
      field_A: some_value
      field_B: some_other_value
  - item_B:
      field_A: some_value
      field_B: some_other_value
I want easy access to the names of the objects (item_A, etc.).
When iterating over the former config:
for item in items:
    print(item)  # here, 'item' is a key of the items dict (e.g. 'item_A')
versus the latter:
for item in items:
    print(item)  # here, 'item' is a full dict (e.g. {'item_A': {'field_A': 'some_value', ...}})
    # to get the name item_A, I need to do list(item.keys())[0], which is not so friendly
I know that the second representation is the logically right one for the situation, but I find it more convenient to use the first representation in order to be able to iterate and get the key/object names directly.
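To make the comparison concrete, here is a small sketch that uses plain Python literals as stand-ins for the two parsed documents (the variable names are made up for illustration, and the shapes are what I assume a YAML loader would return):

```python
# Stand-in for the first config: 'items' is a mapping keyed by item name.
as_mapping = {
    "items": {
        "item_A": {"field_A": "some_value", "field_B": "some_other_value"},
        "item_B": {"field_A": "some_value", "field_B": "some_other_value"},
    }
}

# Stand-in for the second config: 'items' is a list of single-key mappings.
as_sequence = {
    "items": [
        {"item_A": {"field_A": "some_value", "field_B": "some_other_value"}},
        {"item_B": {"field_A": "some_value", "field_B": "some_other_value"}},
    ]
}

# Mapping form: iterating yields the names directly.
mapping_names = list(as_mapping["items"])

# Sequence form: each element is a dict, so the single key has to be pulled out.
sequence_names = [next(iter(item)) for item in as_sequence["items"]]

print(mapping_names)
print(sequence_names)
```

Both loops end up with the same names, but the sequence form needs the extra `next(iter(item))` step for every element.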
So, my question is:
Is it considered a bad habit to represent a list as a dictionary as in the provided example, in order to achieve easy access to the names/keys of the items?
Are there any drawbacks or problems that can occur when doing this?
Upvotes: 1
Views: 71
Reputation: 76578
It is debatable what is a bad habit, and also what is logical. I think
you assume that because the key items in your root-level mapping is
plural, the value consists of multiple items and should be a
sequence. I don't think that is necessarily true.
However, if the order of the key-value pairs in the mapping that is the value
for items (i.e. the one with the keys item_A and item_B) does
matter, then you have to use a list, and you probably want to do:
items:
- name: item_A
  field_A: some_value
  field_B: some_other_value
- name: item_B
  field_A: some_value
  field_B: some_other_value
When you load this into a variable data, there is no longer easy access
to item_B, just as with your solution. What you can do after loading is:
data['items'] = Items(data['items'])
with class Items appropriately providing access to the underlying
data structure by providing __getitem__ and __iter__, so that you can do:
items = data['items']
for item_name in items:
    item = items[item_name]
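If name-based lookup and ordered iteration are all that is needed, a plain dict keyed by name may be enough as the post-processing step. This is a lighter alternative to the Items class, not the same thing: it relies on dicts preserving insertion order (Python 3.7+), and the data literal below is only a stand-in for what loading the YAML above is assumed to return.

```python
# Stand-in for the loaded YAML: 'items' is a list of mappings with a 'name' key.
data = {
    "items": [
        {"name": "item_A", "field_A": "some_value", "field_B": "some_other_value"},
        {"name": "item_B", "field_A": "some_value", "field_B": "some_other_value"},
    ]
}

# Re-key the list by name; insertion order follows the order in the YAML sequence.
items_by_name = {item["name"]: item for item in data["items"]}

for item_name in items_by_name:
    item = items_by_name[item_name]
    print(item_name, item["field_B"])
```

Unlike a dedicated class, this loses positional access such as items[1], and a later item with a duplicate name silently overwrites the earlier one.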
You can do that without a postprocessing step after loading, by using a tag:
items: !Items
- name: item_A
  field_A: some_value_1
  field_B: some_other_value_1
- name: item_B
  field_A: some_value_2
  field_B: some_other_value_2
and registering your class Items. This may, however, seem less user-friendly than the
version without a tag, although explicit is IMO better than implicit in this case.
Assuming the above is input.yaml:
from pathlib import Path

import ruamel.yaml

input_path = Path('input.yaml')
yaml = ruamel.yaml.YAML()


@yaml.register_class
class Items:
    def __init__(self, items):
        self.item_list = items

    @classmethod
    def from_yaml(cls, constructor, node):
        # build the wrapped list from the sequence node carrying the !Items tag
        return cls(constructor.construct_sequence(node, deep=True))

    def __getitem__(self, key):
        # integer keys index by position, any other key is matched against 'name'
        if isinstance(key, int):
            return self.item_list[key]
        for item in self.item_list:
            if item['name'] == key:
                return item
        raise KeyError(key)  # unknown name: fail loudly instead of returning None

    def __iter__(self):
        # iterate over the item names, the way a dict iterates over its keys
        for item in self.item_list:
            yield item['name']


data = yaml.load(input_path)
items = data['items']
print('item1', items[1]['field_A'])
print('item2', items['item_A']['field_A'])
for item_name in items:
    item = items[item_name]
    print('item3', item['field_B'])
which gives:
item1 some_value_2
item2 some_value_1
item3 some_other_value_1
item3 some_other_value_2
If items is the only key at the root level of your YAML document,
then of course there is no need to have that key at all: you
should just put your tag at the start of the document and have a
sequence as the root node.
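As a sketch of what such a document could look like, reusing the !Items tag and the item data from above (I have not shown the loading code for this variant; the assumption is that loading then gives you the Items instance directly, without the data['items'] lookup):

```yaml
!Items
- name: item_A
  field_A: some_value_1
  field_B: some_other_value_1
- name: item_B
  field_A: some_value_2
  field_B: some_other_value_2
```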
Upvotes: 1