Reputation: 2535
A thing you can do with human-written YAML is this:
foo: &foo_anchor
key1: v1
key2: v2
key3: v3
bar:
<<: *foo_anchor
key2: override_value
I would like to programmatically generate output like that using PyYAML. It seems tricky! By default, as far as I can tell, PyYAML only generates anchors/references when it encounters equal objects (and the order is probably not defined, whereas in this example, bar
has to reference foo
, not the other way around). I've tried a few things—defining a YamlReference
class and checking for its tag in an overridden Dumper.serialize_node
method—but trying to do something like:
if node.tag.endswith('magic.prefix.YamlReference'):
alias = node.value[0].value
self.emit(yaml.events.AliasEvent(alias))
super(Dumper, self).anchor_node(node.value[1])
super(Dumper, self).serialize_node(node.value[1], parent, idx)
messes up the expected event stream. Is this possible?
Upvotes: 2
Views: 1758
Reputation: 76578
One way of achieving this is making a class that holds the appropriate
information about the merge information, and still allows lookup of
data['bar']['key1']
. Of course you need to properly dump this class with an appropriate representer.
This is what ruamel.yaml (disclaimer I am the author of that package) does to allow round-tripping of merged maps:
import sys
import ruamel.yaml
yaml_str = """\
foo: &foo_anchor
key1: v1
key2: v2
key3: v3
bar:
<<: *foo_anchor
key2: override_value
"""
yaml = ruamel.yaml.YAML()
data = yaml.load(yaml_str)
yaml.dump(data, sys.stdout)
which gives:
foo: &foo_anchor
key1: v1
key2: v2
key3: v3
bar:
<<: *foo_anchor
key2: override_value
So I suggest you look at the class CommentedMap
and how this is
handled in constructor.py
and representer.py
.
If you can upgrade to ruamel.yaml
then you can do:
cm = ruamel.yaml.comments.CommentedMap
data = cm()
data['foo'] = foo = cm(key1='v1', key2='v2', key3='v3')
foo.yaml_set_anchor('foo_anchor')
data['bar'] = bar = cm(key2='override_value')
bar.add_yaml_merge([(0, foo)])
yaml = ruamel.yaml.YAML()
yaml.dump(data, sys.stdout)
which gives something similar to what you expect, starting from scratch:
foo: &foo_anchor
key1: v1
key2: v2
key3: v3
bar:
<<: *foo_anchor
key2: override_value
And of course the following works as expected:
print(list(data['bar'].keys()))
print(data['bar']['key3'])
to give:
['key2', 'key1', 'key3']
v3
Upvotes: 0
Reputation: 39638
Well you can do something like this:
import yaml
class Merger(object):
pass
def merger_representer(dumper, data):
return dumper.represent_scalar(u'tag:yaml.org,2002:merge', '<<')
yaml.add_representer(Merger, merger_representer)
foo = {'key1': 'v1', 'key2': 'v2', 'key3': 'v3'}
root = {
'foo': foo,
'bar': {
Merger(): foo,
'key2': 'override_value'
}
}
print(yaml.dump(root, sort_keys=False))
Output is:
foo: &id001
key1: v1
key2: v2
key3: v3
bar:
<<: *id001
key2: override_value
sort_keys=False
ensures correct order of the keys, it requires Python >= 3.7 and PyYAML >= 5.1 (thanks @tinita). You have no control over the generated anchor name, but this YAML is equivalent to yours.
You need the Merger
class to force PyYAML to emit <<
(with a normal string key, it would emit '<<'
so that it doesn't get confused with a merge key).
Upvotes: 1