Cory
Cory

Reputation: 15615

Is it possible to do string substitution in YAML?

Is there a way to substitute string in YAML. For example, I would like to define sub once and use it throughout the YAML file.

sub: ['a', 'b', 'c']
command:
    params:
        cmd1:
            type: string
            enum :   # Get the list defined in 'sub'
            description: Exclude commands from the test list.
        cmd2:
            type: string
            enum:   # Get the list defined in 'sub'

Upvotes: 9

Views: 37021

Answers (1)

Anthon
Anthon

Reputation: 76599

You cannot really substitute string values in YAML, as in replacing a substring of some string with another substring¹. YAML does however have a possibility to mark a node (in your case the list ['a', 'b', 'c'] with an anchor and reuse that as an alias node.

Anchors take the form &some_id and are inserted before a node and alias nodes are specified by *some_id (instead of a node).

This is not the same as substitution on the string level, because during parsing of the YAML file the reference can be preserved. As is the case when loading YAML in Python for any anchors on collection types (i.e. not when using anchors on scalars):

import sys
import ruamel.yaml as yaml

yaml_str = """\
sub: &sub0 [a, b, c]
command:
    params:
        cmd1:
            type: string
            # Get the list defined in 'sub'
            enum : *sub0
            description: Exclude commands from the test list.
        cmd2:
            type: string
            # Get the list defined in 'sub'
            enum: *sub0
"""

data1 = yaml.load(yaml_str, Loader=yaml.RoundTripLoader)

# the loaded elements point to the same list
assert data1['sub'] is data1['command']['params']['cmd1']['enum']

# change in cmd2
data1['command']['params']['cmd2']['enum'][3] = 'X'


yaml.dump(data1, sys.stdout, Dumper=yaml.RoundTripDumper, indent=4)

This will output:

sub: &sub0 [a, X, c]
command:
    params:
        cmd1:
            type: string
            # Get the list defined in 'sub'
            enum: *sub0
            description: Exclude commands from the test list.
        cmd2:
            type: string
            # Get the list defined in 'sub'
            enum: *sub0

please note that the original anchor name is preserved² in ruamel.yaml.

If you don't want the anchors and aliases in your output you can override the ignore_aliases method in the RoundTripRepresenter subclass of RoundTripDumper (that method takes two arguments, but using lambda *args: .... you don't have to know about that):

dumper = yaml.RoundTripDumper
dumper.ignore_aliases = lambda *args : True
yaml.dump(data1, sys.stdout, Dumper=dumper, indent=4)

Which gives:

sub: [a, X, c]
command:
    params:
        cmd1:
            type: string
            # Get the list defined in 'sub'
            enum: [a, X, c]
            description: Exclude commands from the test list.
        cmd2:
            type: string
            # Get the list defined in 'sub'
            enum: [a, X, c]

And this trick can be used to read in the YAML file as if you had done string substitution, by re-reading the material you dumped while ignoring the aliases:

data2 = yaml.load(yaml.dump(yaml.load(yaml_str, Loader=yaml.RoundTripLoader),
                    Dumper=dumper, indent=4), Loader=yaml.RoundTripLoader)

# these are lists with the same value
assert data2['sub'] == data2['command']['params']['cmd1']['enum']
# but the loaded elements do not point to the same list
assert data2['sub'] is not data2['command']['params']['cmd1']['enum']

data2['command']['params']['cmd2']['enum'][5] = 'X'

yaml.dump(data2, sys.stdout, Dumper=yaml.RoundTripDumper, indent=4)

And now only one 'b' is changed into 'X':

sub: [a, b, c]
command:
    params:
        cmd1:
            type: string
            # Get the list defined in 'sub'
            enum: [a, b, c]
            description: Exclude commands from the test list.
        cmd2:
            type: string
            # Get the list defined in 'sub'
            enum: [a, X, c]

As indicated above this is only necessary when using anchors/aliases on collection types, not when you use it on a scalar.


¹ Since YAML can create objects, it is however possible to influence the parser if those objects are created. This answer describes how to do that.
² Preserving the names was originally not available, but was implemented in update of ruamel.yaml

Upvotes: 17

Related Questions