Roy Cohen

Reputation: 153

Is using a dictionary instead of a list in a configuration file a bad habit?

Consider the following YAML configuration file:

items:
    item_A:
      field_A: some_value
      field_B: some_other_value
    item_B:
      field_A: some_value
      field_B: some_other_value

The logical way to represent this is to add a dash in front of each item, making it a list of items:

items:
    - item_A:
        field_A: some_value
        field_B: some_other_value
    - item_B:
        field_A: some_value
        field_B: some_other_value

I want easy access to the names of the objects (item_A etc.).

When iterating over the former config,

for item in items:
    print(item) # here, 'item' will be the keys in the items dict (eg. 'item_A')

versus the latter

for item in items:
    print(item) # here, 'item' will be a full dict (eg. {"item_A": {"field_A": "some_value", ...}})
    # if I want access to the name item_A, I need to do list(item.keys())[0] - not so friendly

I know that the second representation is the logically right one for the situation, but I find it more convenient to use the first representation in order to be able to iterate and get the key/object names directly.

So, my question is:

Is it considered a bad habit to represent a list as a dictionary as in the provided example, in order to achieve easy access to the names/keys of the items?

Are there any drawbacks or problems that can occur when doing this?

Upvotes: 1

Views: 71

Answers (1)

Anthon

Reputation: 76578

What counts as a bad habit, and what counts as logical, is debatable. I think you assume that because the key items in your root-level mapping is plural, its value consists of multiple items and should therefore be a sequence. I don't think that is necessarily true.

However, if the order of the key-value pairs in the mapping that is the value for items (i.e. the one with the keys item_A and item_B) does matter, then you have to use a list. In that case you probably want:

items:
- name: item_A
  field_A: some_value
  field_B: some_other_value
- name: item_B
  field_A: some_value
  field_B: some_other_value

When you load this into a variable data, there is no longer easy access to item_B by name, just as with your list-based solution. What you can do after loading is:

data['items'] = Items(data['items'])

with the class Items providing access to the underlying data structure through __getitem__ and __iter__, so that you can do:

items = data['items']
for item_name in items:
    item = items[item_name]
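
If a wrapper class is more than you need, a plain dict built once after loading gives similar name-based access. The following is only a minimal sketch: the data literal is a hand-written stand-in for what loading the YAML above would produce, and index_by_name is a hypothetical helper name, not part of any library.

```python
# Hand-written stand-in for the structure the YAML above loads into (assumption).
data = {
    "items": [
        {"name": "item_A", "field_A": "some_value", "field_B": "some_other_value"},
        {"name": "item_B", "field_A": "some_value", "field_B": "some_other_value"},
    ]
}


def index_by_name(items):
    """Return a dict mapping each item's 'name' to the item itself.

    Raises ValueError on a duplicate name, because a list does not
    prevent two items from sharing the same name.
    """
    by_name = {}
    for item in items:
        name = item["name"]
        if name in by_name:
            raise ValueError(f"duplicate item name: {name!r}")
        by_name[name] = item
    return by_name


items_by_name = index_by_name(data["items"])

# Iterating over the dict yields the names, in the order of the original list
# (dicts keep insertion order in Python 3.7+).
for item_name in items_by_name:
    item = items_by_name[item_name]
    print(item_name, item["field_A"])
```

The trade-off compared to a wrapper class is that positional access (the second item, say) still goes through the original list rather than through the index.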

You can get the same Items wrapper without a postprocessing step after loading by using a tag:

items: !Items
- name: item_A
  field_A: some_value_1
  field_B: some_other_value_1
- name: item_B
  field_A: some_value_2
  field_B: some_other_value_2

and registering your class Items. This may seem less user-friendly than the version without a tag, although IMO explicit is better than implicit in this case.

Assuming the above is input.yaml:

from pathlib import Path
import ruamel.yaml

input_path = Path('input.yaml')  # avoid shadowing the built-in input()

yaml = ruamel.yaml.YAML()

@yaml.register_class  # makes the loader construct Items for the tag !Items
class Items:
    def __init__(self, items):
        self.item_list = items

    @classmethod
    def from_yaml(cls, constructor, node):
        # build the wrapper from the tagged sequence node
        return cls(constructor.construct_sequence(node, deep=True))

    def __getitem__(self, key):
        # an integer indexes by position, anything else is looked up by name
        if isinstance(key, int):
            return self.item_list[key]
        for item in self.item_list:
            if item['name'] == key:
                return item
        raise KeyError(key)  # unknown name: fail loudly instead of returning None

    def __iter__(self):
        # iterate over the item names, like iterating over a dict
        for item in self.item_list:
            yield item['name']

data = yaml.load(input_path)
items = data['items']
print('item1', items[1]['field_A'])
print('item2', items['item_A']['field_A'])
for item_name in items:
    item = items[item_name]
    print('item3', item['field_B'])

which gives:

item1 some_value_2
item2 some_value_1
item3 some_other_value_1
item3 some_other_value_2

If items is the only key at the root level of your YAML document, there is no need for that key at all: put the tag at the start of the document and make the sequence the root node.

Upvotes: 1
