El Guapo

Reputation: 5781

How to Reference an aliased map value in YAML

I have a feeling this isn't possible, but I have a snippet of YAML that looks like the following:

.map_values: &my_map
  a: 'D'
  b: 'E'
  a: 'F'

section:
  stage: *my_map['b']

I would like stage to have the value of E.

Is this possible within YAML? I've tried just about every incarnation of substitution I can think of.

Upvotes: 1

Views: 2182

Answers (1)

Anthon

Reputation: 76598

Your mapping contains a duplicate key, which is not allowed in YAML 1.2 (and should at least throw a warning in YAML 1.1), so this is not going to work as written. But even if you correct that, you can't do what you want with just anchors and aliases.

The only substitution-like replacement available in YAML is the "Merge Key Language-Independent Type". It is referenced indirectly in the YAML spec rather than included in it, but it is available in most parsers.

The only thing it allows you to do is "update" a mapping with the key-value pairs of one or more other mappings, for keys that don't already exist in the mapping. You use the special key << for that, which takes an alias or a list of aliases.
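To make the merge semantics concrete, here is a minimal pure-Python sketch that imitates, on already-loaded dicts, what a loader does with <<. The name apply_merge is made up for this illustration and is not part of any YAML library; real parsers do this internally while constructing the mapping.

```python
def apply_merge(mapping):
    """Return a copy of `mapping` with its '<<' entry expanded.

    Hypothetical helper that only imitates YAML merge keys on plain dicts.
    """
    merged = {}
    sources = mapping.get('<<', [])
    if isinstance(sources, dict):  # a single alias instead of a list
        sources = [sources]
    # Earlier mappings in the list take precedence over later ones.
    for source in sources:
        for key, value in source.items():
            merged.setdefault(key, value)
    # Keys given explicitly in the mapping always win over merged keys.
    for key, value in mapping.items():
        if key != '<<':
            merged[key] = value
    return merged


my_map = {'a': 'D', 'b': 'E', 'c': 'F'}
section = apply_merge({'<<': my_map, 'b': 'overridden', 'stage': 'x'})
print(section)
```

Note that the whole of my_map ends up in section: a merge can only pull in a mapping wholesale, not pick out the single value for b.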

There is no facility, specified in the YAML specification, to dereference particular keys.

There are some systems that use templates to generate YAML, but there are two main problems with applying them here:

  • the template languages themselves often clash with the indicators in the YAML syntax, so the template is not valid YAML

  • even if the template could be loaded as valid YAML, and the values needed to update other parts of the template could be extracted, you would need to parse the input twice (once to get the values to update the template, then again to parse the updated template). Given the potential complexity of YAML and the relatively slow speed of its parsers, this can be prohibitive

What you can do is create some tag (e.g. !Lookup) and have its constructor interpret that node. Since the node has to be valid YAML again, you have to decide whether to use a sequence or a mapping. In both cases you'll have to include some special syntax for the values, and in the case of mappings also for the key (like the << used in merges).

In the examples I left out the superfluous single quotes; depending on your real values you might of course need them.

Example using a sequence:

.map_values: &my_map
  a: D
  b: E
  c: F

section: !Lookup
- *my_map
- stage: <b>

Example using a mapping:

.map_values: &my_map
  a: D
  b: E
  c: F

section: !Lookup
  <<: *my_map
  stage: <b>

Both can be made to construct the data on the fly (i.e. no post-loading processing of your data structure is necessary). E.g. using Python and the sequence "style" in input.yaml:

import sys
import ruamel.yaml
from pathlib import Path

# named input_path so the builtin input() is not shadowed
input_path = Path('input.yaml')

yaml = ruamel.yaml.YAML(typ='safe')
yaml.default_flow_style = False

@yaml.register_class
class Lookup:
    @classmethod
    def from_yaml(cls, constructor, node):
        """
        This expects a two-entry sequence. The first entry is a mapping X,
        typically given as an alias. The second entry should be a mapping;
        its values of the form <key> are looked up in X.
        Non-existing keys will throw an error during loading.
        """
        X, res = constructor.construct_sequence(node, deep=True)
        # hand res to the constructor first, then fill in the lookups
        yield res
        for key, value in res.items():
            try:
                if value.startswith('<') and value.endswith('>'):
                    res[key] = X[value[1:-1]]
            except AttributeError:
                # value is not a string (number, date, boolean, ...)
                pass
        return res


data = yaml.load(input_path)
yaml.dump(data, sys.stdout)

which gives:

.map_values:
  a: D
  b: E
  c: F
section:
  stage: E

There are a few things to note:

  • using <...> is arbitrary; you don't need both a beginning and an end marker. I do recommend using some character(s) that have no special meaning in YAML, so you don't need to quote your values. You can e.g. use some well-recognisable Unicode code point, but those tend to be a pain to type in an editor.
  • when from_yaml is called, the anchor is not yet fully constructed, so X is an empty dict that gets filled later on. The construct with yield implements a two-step process: we first give res back "as-is" to the constructor, then update it later. The constructor stage of the loader knows how to handle this automatically when it gets a generator instead of a "normal" value.
  • the try .. except is there to handle mapping values that are not strings (i.e. numbers, dates, booleans).
  • you can do substitutions in keys as well, just make sure you delete the old key
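As a rough illustration of the last two notes, the following pure-Python sketch applies the <key> substitution to both keys and values of a plain dict. The helper name substitute is hypothetical (it is not part of ruamel.yaml); inside from_yaml, logic like this would presumably replace the loop that runs after the yield.

```python
def substitute(lookup, target):
    """Replace <key> markers in the keys and values of `target`, in place.

    Hypothetical helper; unknown keys raise KeyError, as in the loader above.
    """
    def resolve(item):
        # Non-strings (numbers, dates, booleans) pass through unchanged.
        if isinstance(item, str) and item.startswith('<') and item.endswith('>'):
            return lookup[item[1:-1]]
        return item

    # Iterate over a snapshot so keys can be replaced while looping.
    for key, value in list(target.items()):
        new_key = resolve(key)
        if new_key != key:
            del target[key]  # delete the old key before adding the new one
        target[new_key] = resolve(value)
    return target


X = {'a': 'D', 'b': 'E', 'c': 'F'}
res = {'stage': '<b>', '<a>': 'name', 'count': 3}
substitute(X, res)
print(res)
```

Here an isinstance check stands in for the try .. except; either works. Looked-up values used as keys have to be hashable.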

Since tags are standard YAML, the above should be doable one way or another in any YAML parser, independent of the language.

Upvotes: 1
