Reputation: 4175
I recently stumbled across a behavior of PHP's yaml_parse, in which values defined through anchors and aliases in YAML are returned as references within the PHP array, giving this behavior:
$yaml = <<<YAML
a: &foo bar
b: *foo
YAML;
$arr = yaml_parse($yaml);
echo $arr["b"]; // prints "bar" as expected
// but when I update $arr["a"]:
$arr["a"] = "baz";
// $arr["b"] is also updated - because it's a reference!
echo $arr["b"]; // prints "baz"!
This is fine and all, but right now for my application I need to flatten these references so I can change the values separately.
I do have a bad solution for this, but is there a good one?
Here's the bad solution I'm using for now:
$yaml = <<<YAML
a: &foo bar
b: *foo
YAML;
$arr = yaml_parse(yaml_emit(yaml_parse($yaml))); // yaml_emit doesn't emit anchors/references
$arr["a"] = "baz";
echo $arr["b"]; // prints "bar"
Upvotes: 1
Views: 438
Reputation: 76872
If your input is in the file test.yaml:
a: &foo bar # hello
b: *foo
Then loading and dumping that file with the following program expands the aliases wherever they can be expanded (recursive data cannot be flattened).
import sys
from pathlib import Path
import ruamel.yaml
def null_op(*args, **kw):
    return True
# prevent anchors from being preserved even if there are no aliases for them
ruamel.yaml.comments.CommentedBase.yaml_set_anchor = null_op
ruamel.yaml.scalarstring.ScalarString.yaml_set_anchor = null_op
ruamel.yaml.scalarint.ScalarInt.yaml_set_anchor = null_op
ruamel.yaml.scalarfloat.ScalarFloat.yaml_set_anchor = null_op
ruamel.yaml.scalarbool.ScalarBoolean.yaml_set_anchor = null_op
# backup the original file if not backed up yet
yaml_file = Path('test.yaml')
backup = yaml_file.with_suffix('.yaml.org')
if not backup.exists():
    backup.write_bytes(yaml_file.read_bytes())
yaml = ruamel.yaml.YAML()
# yaml.indent(mapping=4, sequence=4, offset=2)
yaml.preserve_quotes = True
yaml.representer.ignore_aliases = null_op
data = yaml.load(yaml_file)
yaml.dump(data, yaml_file)
which gives:
a: bar # hello
b: bar # hello
Replacing the yaml_set_anchor methods is necessary, as otherwise your output would have the original anchor both where it had the anchor and where it had the alias.
As you can see, if you have a comment on the anchored data, this gets copied (and retains the original start column). Any comments after the alias disappear. That doesn't affect the semantics of the loaded data and should not be a problem.
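To illustrate what "flattening" means for data that is already loaded, and why recursive data cannot be flattened, here is a minimal sketch using only the standard library. The helper name flatten_shared is hypothetical and not part of ruamel.yaml, and it assumes the data consists of plain dicts, lists and scalars (it would return plain containers and drop any round-trip comments). Note that copy.deepcopy would not do this, since it keeps shared objects shared in the copy.

```python
def flatten_shared(node, _active=None):
    """Copy nested dicts/lists so that no container is shared between two places.

    Raises ValueError for recursive (self-referencing) data, which has no
    finite expanded form.
    """
    if _active is None:
        _active = set()
    if isinstance(node, (dict, list)):
        if id(node) in _active:
            # we are already inside this container: the data is recursive
            raise ValueError("recursive structure cannot be flattened")
        _active.add(id(node))
        try:
            if isinstance(node, dict):
                return {k: flatten_shared(v, _active) for k, v in node.items()}
            return [flatten_shared(v, _active) for v in node]
        finally:
            _active.discard(id(node))
    # scalars (str, int, float, bool, None) are returned as-is
    return node


# Usage: 'shared' plays the role of an anchored mapping that is aliased under "b"
shared = {"x": 1}
data = {"a": shared, "b": shared}
flat = flatten_shared(data)
flat["a"]["x"] = 2
print(flat["b"]["x"])  # 1, because "a" and "b" are now independent copies
```

Each occurrence of a shared container gets its own copy, so the cost grows with the number of aliases, which is the same trade-off as expanding them in the YAML file itself.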
Upvotes: 1