Reputation: 466
I've been trying to dump a dictionary to a YAML file. The problem is that the program that imports the YAML file needs the keywords in a specific order. This order is not alphabetically.
import yaml
import os
baseFile = 'myfile.dat'
lyml = [{'BaseFile': baseFile}]
lyml.append({'Environment':{'WaterDepth':0.,'WaveDirection':0.,'WaveGamma':0.,'WaveAlpha':0.}})
CaseName = 'OrderedDict.yml'
CaseDir = r'C:\Users\BTO\Documents\Projects\Mooring code testen'
CaseFile = os.path.join(CaseDir, CaseName)
with open(CaseFile, 'w') as f:
yaml.dump(lyml, f, default_flow_style=False)
This produces a *.yml file which is formatted like this:
- BaseFile: myfile.dat
- Environment:
WaterDepth: 0.0
WaveAlpha: 0.0
WaveDirection: 0.0
WaveGamma: 0.0
But what I want is that the order is preserved:
- BaseFile: myfile.dat
- Environment:
WaterDepth: 0.0
WaveDirection: 0.0
WaveGamma: 0.0
WaveAlpha: 0.0
Is this possible?
Upvotes: 43
Views: 52133
Reputation: 23
oyaml is a python library which preserves dict ordering when dumping. It is specifically helpful in more complex cases where the dictionary is nested and may contain lists.
Once installed:
import oyaml as yaml
with open(CaseFile, 'w') as f:
f.write(yaml.dump(lyml))
Upvotes: 1
Reputation: 781
yaml.dump
has a sort_keys
keyword argument that is set to True
by default. Set it to False
to not reorder:
with open(CaseFile, 'w') as f:
yaml.dump(lyml, f, default_flow_style=False, sort_keys=False)
Upvotes: 78
Reputation: 27684
Use an OrderedDict instead of dict. Run the below setup code at the start. Now yaml.dump
, should preserve the order. More details here and here
def setup_yaml():
""" https://stackoverflow.com/a/8661021 """
represent_dict_order = lambda self, data: self.represent_mapping('tag:yaml.org,2002:map', data.items())
yaml.add_representer(OrderedDict, represent_dict_order)
setup_yaml()
Example: https://pastebin.com/raw.php?i=NpcT6Yc4
Upvotes: 35
Reputation: 6836
PyYAML supports representer
to serialize a class instance to a YAML node.
yaml.YAMLObject uses metaclass magic to register a constructor, which transforms a YAML node to a class instance, and a representer, which serializes a class instance to a YAML node.
Add following lines above your code:
def represent_dictionary_order(self, dict_data):
return self.represent_mapping('tag:yaml.org,2002:map', dict_data.items())
def setup_yaml():
yaml.add_representer(OrderedDict, represent_dictionary_order)
setup_yaml()
Then you can use OrderedDict
to preserve the order in yaml.dump()
:
import yaml
from collections import OrderedDict
def represent_dictionary_order(self, dict_data):
return self.represent_mapping('tag:yaml.org,2002:map', dict_data.items())
def setup_yaml():
yaml.add_representer(OrderedDict, represent_dictionary_order)
setup_yaml()
dic = OrderedDict()
dic['a'] = 1
dic['b'] = 2
dic['c'] = 3
print(yaml.dump(dic))
# {a: 1, b: 2, c: 3}
Upvotes: 8
Reputation: 76802
Your difficulties are a result of assumptions on multiple levels that are incorrect and, depending on your YAML parser, might not be transparently resolvable.
In Python's dict
the keys are unordered (at least for Python < 3.6). And even though the keys have some order in the source file, as soon as they are in the dict
they aren't:
d = {'WaterDepth':0.,'WaveDirection':0.,'WaveGamma':0.,'WaveAlpha':0.}
for key in d:
print key
gives:
WaterDepth
WaveGamma
WaveAlpha
WaveDirection
If you want your keys ordered you can use the collections.OrderedDict type (or my own ruamel.ordereddict type which is in C and more than an order of magnitude faster), and you have to add the keys ordered, either as a list of tuples:
from ruamel.ordereddict import ordereddict
# from collections import OrderedDict as ordereddict # < this will work as well
d = ordereddict([('WaterDepth', 0.), ('WaveDirection', 0.), ('WaveGamma', 0.), ('WaveAlpha', 0.)])
for key in d:
print key
which will print the keys in the order they were specified in the source.
The second problem is that even if a Python dict has some key ordering that happens to be what you want, the YAML specification does explicitly say that mappings are unordered and that is the way e.g. PyYAML implements the dumping of Python dict to YAML mapping (And the other way around). Also, if you dump an ordereddict or OrderedDict you normally don't get the plain YAML mapping that you indicate you want, but some tagged YAML entry.
As losing the order is often undesirable, in your case because your reader assumes some order, in my case because that made it difficult to compare versions because key ordering would not be consistent after insertion/deletion, I implemented round-trip consistency in ruamel.yaml so you can do:
import sys
import ruamel.yaml as yaml
yaml_str = """\
- BaseFile: myfile.dat
- Environment:
WaterDepth: 0.0
WaveDirection: 0.0
WaveGamma: 0.0
WaveAlpha: 0.0
"""
data = yaml.load(yaml_str, Loader=yaml.RoundTripLoader)
print(data)
yaml.dump(data, sys.stdout, Dumper=yaml.RoundTripDumper)
which gives you exactly your output result. data
works as a dict (and so does `data['Environment'], but underneath they are smarter constructs that preserve order, comments, YAML anchor names etc). You can of course change these (adding/deleting key-value pairs), which is easy, but you can also build these from scratch:
import sys
import ruamel.yaml as yaml
from ruamel.yaml.comments import CommentedMap
baseFile = 'myfile.dat'
lyml = [{'BaseFile': baseFile}]
lyml.append({'Environment': CommentedMap([('WaterDepth', 0.), ('WaveDirection', 0.), ('WaveGamma', 0.), ('WaveAlpha', 0.)])})
yaml.dump(data, sys.stdout, Dumper=yaml.RoundTripDumper)
Which again prints the contents with keys in the order you want them. I find the later less readable, than when starting from a YAML string, but it does construct the lyml data structure somewhat faster.
Upvotes: 1