Reputation: 1231
I am creating a configuration file for my application. To do it, I decided to use YAML for its simplicity and reliability.
I am currently designing one particular part of my application: in this part, I have to list and configure all the datasets I want to use in a module. To do that, I wrote this:
# Other stuff
datasets:
  rate_variation:
    name: Rate variation over time # Optional
    description: Description here # Optional
    type: POINTS_2D
    options:
      REFRESH_TIME: 5 # Refresh time in seconds
  frequency_variation:
    name: Frequency variation over time
    description: Description here # Optional
    type: POINTS_2D
But after some reflection, I have doubts about it, because maybe something like this is better:
datasets:
  - id: rate_variation
    name: Rate variation over time # Optional
    description: Description here # Optional
    type: POINTS_2D
    options:
      REFRESH_TIME: 5 # Refresh time in seconds
  - id: frequency_variation
    name: Frequency variation over time
    description: Description here # Optional
    type: POINTS_2D
I use the ID to identify each dataset in my scripts (no two datasets may share an id) and to generate output files for each of them. But now I really don't know which solution is better...
Which one would you recommend, and why?
Upvotes: 8
Views: 13629
Reputation: 32360
Scenario: Developer graille_stentiplub is creating a configuration file format for use with YAML.
Special considerations: graille_stentiplub wants an easy way to determine when to use lists vs mappings.
The following is a simple config file using the YAML ddconfig format:
dataroot:
  file_metadata_str: |
    ### <beg-block>
    ### - caption: "my first project"
    ### notes: |
    ### * href="//home/sm/docs/workup/my_first_project.txt"
    ### <end-block>
  project_info:
    prj_name_nice: StackOverflow Demo Answer Project
    prj_name_mach: stackoverflow_demo_001a
    prj_sponsor_url: https://stackoverflow.com/questions/54349286
    prj_dept_url: https://demo-university.edu/dept/basketweaving
  dataset_recipient_list:
    - [email protected]
    - [email protected]
    - [email protected]
  dataset_variations_table:
    - dvar_id: rate_variation
      dvar_name: Rate variation over time # Optional
      dvar_description: Description here # Optional
      dvar_type: POINTS_2D
      dvar_opt_refresh_per_second: 5 # Time in seconds
    - dvar_id: frequency_variation
      dvar_name: Frequency variation over time
      dvar_description: Description here # Optional
      dvar_type: POINTS_2D
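As a rough illustration of the naming convention used in this example (keys ending in _str, _list, _info or _table announce the shape of their value), the sketch below checks loaded data against those suffixes. The helper names (`is_scalar`, `is_info`, `check_suffixes`) are hypothetical and not part of any ddconfig tooling, and the dictionary is written out by hand to mirror what a typical YAML loader would return for part of the config, so no YAML library is needed. The _tree and _struct suffixes are deliberately left unchecked.

```python
# Hypothetical validator for ddconfig-style key suffixes (illustration only).

def is_scalar(value):
    """True for the plain values a YAML scalar usually loads to."""
    return value is None or isinstance(value, (str, int, float, bool))

def is_info(value):
    """An _info value: a flat mapping of scalar values, no nesting."""
    return isinstance(value, dict) and all(is_scalar(v) for v in value.values())

SUFFIX_CHECKS = {
    "_str": lambda v: isinstance(v, str),
    "_list": lambda v: isinstance(v, list) and all(is_scalar(i) for i in v),
    "_info": is_info,
    "_table": lambda v: isinstance(v, list) and all(is_info(i) for i in v),
}

def check_suffixes(mapping):
    """Return the keys whose value does not match the shape their suffix implies."""
    problems = []
    for key, value in mapping.items():
        for suffix, check in SUFFIX_CHECKS.items():
            if key.endswith(suffix) and not check(value):
                problems.append(key)
    return problems

# Hand-written stand-in for part of the loaded example config (assumption).
dataroot = {
    "project_info": {"prj_name_mach": "stackoverflow_demo_001a"},
    "dataset_variations_table": [
        {"dvar_id": "rate_variation", "dvar_type": "POINTS_2D",
         "dvar_opt_refresh_per_second": 5},
        {"dvar_id": "frequency_variation", "dvar_type": "POINTS_2D"},
    ],
}

print(check_suffixes(dataroot))
```

Keys without a recognised suffix are simply ignored here; a stricter variant could reject them instead.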
The entire data structure is nested under a top-level key called dataroot (this is optional).
Using the dataroot key makes the YAML structure more addressable but is not necessary.
You can think of dataroot as a root-level directory.
The entire data structure consists of a YAML mapping (aka dictionary, aka associative array).
All mapping keys are children of dataroot (or else top-level keys if dataroot is omitted).
There are different types of mapping keys:
- A key ending in _str indicates that the mapped value is a string (aka scalar) value.
- A key ending in _list indicates that the mapped value is a list (aka sequence).
- A key ending in _info indicates that the mapped value is a mapping (aka dictionary, aka associative array).
- A key ending in _table indicates that the mapped value is a sequence of mappings (aka table).
- A key ending in _tree or _struct indicates a composite structure with support for one or more nested parent-child relationships.
The ddconfig format coincides nicely with many different contexts and tools.
A _list mapping consists of a sequence of scalar-value items with no nesting.
An _info mapping consists of scalar keys and scalar values (name-value pairs) with no nesting.
A _table mapping is simply a sequence of _info mappings.
A _tree mapping is a composite data structure.
A ddconfig _info mapping can be thought of as a single record from a standard table in a relational database.
A ddconfig _table mapping can be thought of as a standard table in a relational database.
The ddconfig format works well with YAML anchors and aliases.
_info mappings can be easily converted to a _table mapping by way of aliases.
_info mappings can be combined together into another _info mapping by way of YAML merge keys.
Upvotes: 7
Reputation: 39638
With the first option, YAML itself requires mapping keys to be unique, so duplicate IDs are invalid at the YAML level. An editor that supports YAML can therefore help your user by flagging the error while they edit. With the second option, you need to check uniqueness in your own code, and the user only sees the error when loading the syntactically correct YAML into your application.
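For the second (list) layout, the uniqueness check has to live in application code. A minimal sketch of such a check follows; the function name `duplicate_ids` is hypothetical, and the list is a hand-written stand-in for what a YAML loader would typically return for the `datasets` key.

```python
from collections import Counter

def duplicate_ids(datasets):
    """Return, sorted, every id that appears more than once in a list of datasets."""
    counts = Counter(dataset["id"] for dataset in datasets)
    return sorted(ident for ident, n in counts.items() if n > 1)

# Stand-in for the loaded list form of the config (assumption).
datasets = [
    {"id": "rate_variation", "type": "POINTS_2D"},
    {"id": "frequency_variation", "type": "POINTS_2D"},
    {"id": "rate_variation", "type": "POINTS_2D"},  # duplicate on purpose
]

dupes = duplicate_ids(datasets)
if dupes:
    print("duplicate dataset ids:", ", ".join(dupes))
```

Running a check like this right after loading lets the application report the problem in one place, before any output files are generated.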
However, there are other factors to consider. For example, you may have a preference for the resulting in-memory data structures. If you use standard YAML implementations that deserialize to native data structures (PyYAML, SnakeYAML, etc.), the YAML structure dictates the type of the in-memory data structure (you can customize this by writing custom constructors, but that's not trivial). For example, if you want to ask a dataset object for its ID, that is only directly doable with the second structure – with the first structure, you would have to search the parent mapping for the dataset value you hold in order to recover its ID.
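To make the in-memory difference concrete, here is a small sketch that converts between the two shapes after loading. Both helper names (`index_by_id`, `with_ids`) are hypothetical, and the data is a hand-written stand-in for what a loader that deserializes to native dicts and lists would return.

```python
def index_by_id(dataset_list):
    """List form -> dict keyed by id; each dataset still carries its own 'id'."""
    return {dataset["id"]: dataset for dataset in dataset_list}

def with_ids(dataset_map):
    """Mapping form -> list of datasets, copying each key in as an 'id' field."""
    # 'fields or {}' covers a dataset declared with no fields (loaded as None).
    return [{"id": key, **(fields or {})} for key, fields in dataset_map.items()]

# Second structure: a list of mappings, each with an explicit id.
as_list = [
    {"id": "rate_variation", "type": "POINTS_2D", "options": {"REFRESH_TIME": 5}},
    {"id": "frequency_variation", "type": "POINTS_2D"},
]

# First structure: a mapping whose keys are the ids.
as_map = {
    "rate_variation": {"type": "POINTS_2D", "options": {"REFRESH_TIME": 5}},
    "frequency_variation": {"type": "POINTS_2D"},
}

by_id = index_by_id(as_list)
print(by_id["rate_variation"]["id"])   # the dataset knows its own id
print(with_ids(as_map)[0]["id"])       # id injected from the mapping key
```

Either layout can be normalised to the other with a few lines like these, so the choice mostly affects what the user sees in the file and which errors surface at the YAML level.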
So the final answer is (as always): it depends. Think about what you want to do with the data. For simple configuration files, my second argument may be weaker than my first one, but I don't know exactly what you want to do with the data.
Upvotes: 6