Reputation: 21
I have this YAML file:
pb:
{EF:{16, 19}, EH:{16, 19}}
When I apply my Python flattenDict function, I get this:
{('pb', 'EF', 16): None,
('pb', 'EF', 19): None,
('pb', 'EH', 16): None,
('pb', 'EH', 19): None}
I am looking for a YAML syntax like the one below that gives the same result (I want to factor out the repeated node data):
pb:
{EF, EH}, {16, 19}}
Do you have any idea?
Here is my Python flattenDict function:
#!/usr/bin/env python
# encoding: UTF-8
from collections.abc import Mapping  # collections.Mapping was removed in Python 3.10
from operator import add
from pprint import pprint as pp

import yaml

_FLAG_FIRST = object()  # sentinel: no partial key has been built yet


def flattenDict(d, join=add, lift=lambda x: x):
    """Flatten nested mappings into a list of (joined_key, value) pairs."""
    results = []

    def visit(subdict, results, partialKey):
        for k, v in subdict.items():
            newKey = lift(k) if partialKey is _FLAG_FIRST else join(partialKey, lift(k))
            if isinstance(v, Mapping):
                visit(v, results, newKey)  # descend into nested mappings
            else:
                results.append((newKey, v))

    visit(d, results, _FLAG_FIRST)
    return results


with open('data.yaml', 'r') as f:
    testdata = yaml.safe_load(f)
result = flattenDict(testdata, lift=lambda x: (x,))
pp(dict(result))
Upvotes: 1
Views: 1264
Reputation: 76902
In YAML you can use a complex flow node as a key, even as a simple key (i.e. without the ? explicit-key marker). This holds for both YAML 1.2 and YAML 1.1. That means that this:
{a: 1, b: 2}: mapping
[1, 2, a]: sequence
is correct YAML.
The problem is that a mapping normally loads as a Python dict and a sequence as a Python list, both of which are mutable, cannot be hashed, and are not allowed as keys for a Python dict (try executing python -c "{{'a': 1}: 2}").
PyYAML (which supports YAML 1.1) errors out on both of those lines.
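The hashability constraint can be checked in plain Python without any YAML library. The following is a minimal sketch; the helper name can_be_key is made up for this illustration and is not part of PyYAML or ruamel.yaml:

```python
def can_be_key(obj):
    """Return True if obj is hashable and can therefore be a dict key."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


print(can_be_key({'a': 1, 'b': 2}))  # False: dicts are mutable and unhashable
print(can_be_key([1, 2, 'a']))       # False: lists are mutable and unhashable
print(can_be_key((1, 2, 'a')))       # True: a tuple of hashable items is hashable

# A tuple therefore works where the equivalent list would raise TypeError.
d = {(1, 2, 'a'): 'sequence'}
print(d[(1, 2, 'a')])                # sequence
```

Note that a tuple is only hashable if all of its elements are, so a tuple containing a list still cannot be a key.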
Since Python has an immutable list in the form of tuple, I decided to implement loading of sequence keys in Python by constructing them as tuples in ruamel.yaml (which supports YAML 1.2 and YAML 1.1). So the following works:
import sys
import ruamel.yaml
from pprint import pprint as pp
yaml_str = """\
[pb, EF, 16]:
[pb, EF, 19]:
[pb, EH, 16]:
[pb, EH, 19]:
"""
yaml = ruamel.yaml.YAML(typ='rt')
# yaml.indent(mapping=4, sequence=4, offset=2)
# yaml.preserve_quotes = True
data = yaml.load(yaml_str)
pp(data)
print('---------')
yaml.dump(data, sys.stdout)
which prints:
{('pb', 'EF', 16): None,
('pb', 'EF', 19): None,
('pb', 'EH', 16): None,
('pb', 'EH', 19): None}
---------
[pb, EF, 16]:
[pb, EF, 19]:
[pb, EH, 16]:
[pb, EH, 19]:
If you try to load the above YAML in PyYAML it throws an exception:
found unhashable key
in "<unicode string>", line 1, column 1:
[pb, EF, 16]:
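If the goal is mainly to avoid repeating the key parts, one option that does not depend on any special YAML syntax is to store the factors as nested sequences and expand them in Python after loading. This is only a sketch under assumptions: the layout pb: [[EF, EH], [16, 19]] and the function name expand_factored are made up for this illustration, and the dict literal stands in for what a YAML loader would typically return for that layout.

```python
from itertools import product


def expand_factored(data):
    """Expand {top: [factor1, factor2, ...]} into {(top, f1, f2, ...): None}."""
    result = {}
    for top, factors in data.items():
        # product() yields every combination, one element from each factor list
        for combo in product(*factors):
            result[(top,) + combo] = None
    return result


# Stand-in for the loaded form of the assumed YAML:  pb: [[EF, EH], [16, 19]]
loaded = {'pb': [['EF', 'EH'], [16, 19]]}
flat = expand_factored(loaded)
for key in flat:
    print(key)
```

The resulting keys are the same tuples as in the output above; the cost is that the expansion logic lives in your code rather than in the YAML file.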
Notes:
If you don't want to round-trip, use typ="safe". It uses the faster C-loader, which also handles keys that are sequences, but it does not dump those back as smartly, resulting in explicit keys marked with ?.
A proposal for a frozendict for Python did not get accepted, so there is no equivalent, not even in the standard library, that is to a dict what tuple is to a list, and ruamel.yaml doesn't support mappings as keys out of the box. You can of course add this to ruamel.yaml's Constructor if you have such a frozendict.
Although there is a frozenset in Python, and a set in YAML, ruamel.yaml does not currently accept the following as input:
!!set {a , b}: value
Probably needless to say: you cannot change the elements of such a key programmatically without deleting and re-adding the key-value pair.
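Two of the notes above can be sketched in plain Python. The FrozenDict class below is a hypothetical, minimal hashable mapping (it is not a standard-library type and not something ruamel.yaml provides), and rekey is a made-up helper showing the delete-and-re-add step needed to "change" an immutable key. How such a class would be hooked into a YAML constructor is not shown here.

```python
from collections.abc import Mapping


class FrozenDict(Mapping):
    """Hypothetical immutable, hashable mapping; values must be hashable too."""

    def __init__(self, *args, **kwargs):
        self._data = dict(*args, **kwargs)
        self._hash = None

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def __hash__(self):
        # Order-independent hash over the items, computed once and cached.
        if self._hash is None:
            self._hash = hash(frozenset(self._data.items()))
        return self._hash


def rekey(d, old_key, new_key):
    """Replace old_key with new_key by removing and re-adding the pair."""
    d[new_key] = d.pop(old_key)


data = {FrozenDict(a=1, b=2): 'mapping', ('pb', 'EF', 16): None}
print(data[FrozenDict(b=2, a=1)])  # mapping

rekey(data, ('pb', 'EF', 16), ('pb', 'EF', 17))
print(('pb', 'EF', 17) in data)    # True
print(('pb', 'EF', 16) in data)    # False
```

Because the pair is removed and re-added, it moves to the end of the dict's insertion order.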
Upvotes: 1