Ben

Reputation: 466

Dumping a dictionary to a YAML file while preserving order

I've been trying to dump a dictionary to a YAML file. The problem is that the program that imports the YAML file needs the keywords in a specific order, and that order is not alphabetical.

import yaml
import os 

baseFile = 'myfile.dat'
lyml = [{'BaseFile': baseFile}]
lyml.append({'Environment':{'WaterDepth':0.,'WaveDirection':0.,'WaveGamma':0.,'WaveAlpha':0.}})

CaseName = 'OrderedDict.yml'
CaseDir = r'C:\Users\BTO\Documents\Projects\Mooring code testen'
CaseFile = os.path.join(CaseDir, CaseName)
with open(CaseFile, 'w') as f:
    yaml.dump(lyml, f, default_flow_style=False)

This produces a *.yml file which is formatted like this:

- BaseFile: myfile.dat
- Environment:
    WaterDepth: 0.0
    WaveAlpha: 0.0
    WaveDirection: 0.0
    WaveGamma: 0.0

But I want the order to be preserved:

- BaseFile: myfile.dat
- Environment:
    WaterDepth: 0.0
    WaveDirection: 0.0
    WaveGamma: 0.0
    WaveAlpha: 0.0

Is this possible?

Upvotes: 43

Views: 52133

Answers (5)

Daniel Chamami

Reputation: 23

oyaml is a Python library that preserves dict ordering when dumping. It is especially helpful in more complex cases where the dictionary is nested and may contain lists.

Once installed:

import oyaml as yaml

with open(CaseFile, 'w') as f:
    f.write(yaml.dump(lyml))

Upvotes: 1

Eric

Reputation: 781

yaml.dump has a sort_keys keyword argument (added in PyYAML 5.1) that defaults to True. Set it to False to keep the keys in their original order:

with open(CaseFile, 'w') as f:
    yaml.dump(lyml, f, default_flow_style=False, sort_keys=False)
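
For a quick check without touching the file system, here is a minimal self-contained sketch. It assumes PyYAML 5.1 or later (for sort_keys) and Python 3.7 or later (where plain dicts keep insertion order), and it dumps to a string instead of a file; the variable name text is only illustrative.

```python
import yaml  # assumption: PyYAML >= 5.1, which accepts sort_keys

lyml = [
    {'BaseFile': 'myfile.dat'},
    {'Environment': {
        'WaterDepth': 0.0,
        'WaveDirection': 0.0,
        'WaveGamma': 0.0,
        'WaveAlpha': 0.0,
    }},
]

# With no stream argument, yaml.dump returns the YAML document as a string.
# sort_keys=False keeps each dict's own insertion order instead of sorting.
text = yaml.dump(lyml, default_flow_style=False, sort_keys=False)
print(text)
```

The printed keys under Environment should appear in the order they were written in the dict literal.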

Upvotes: 78

balki

Reputation: 27684

Use an OrderedDict instead of a dict, and run the setup code below once at the start. After that, yaml.dump should preserve the order. More details here and here.

import yaml
from collections import OrderedDict

def setup_yaml():
    """ https://stackoverflow.com/a/8661021 """
    # Represent an OrderedDict as a plain YAML mapping, keeping item order
    represent_dict_order = lambda self, data: self.represent_mapping('tag:yaml.org,2002:map', data.items())
    yaml.add_representer(OrderedDict, represent_dict_order)

setup_yaml()

Example: https://pastebin.com/raw.php?i=NpcT6Yc4

Upvotes: 35

Akif

Reputation: 6836

PyYAML supports representers, which serialize a class instance to a YAML node.

yaml.YAMLObject uses metaclass magic to register a constructor, which transforms a YAML node to a class instance, and a representer, which serializes a class instance to a YAML node.

Add the following lines above your code:

def represent_dictionary_order(self, dict_data):
    return self.represent_mapping('tag:yaml.org,2002:map', dict_data.items())

def setup_yaml():
    yaml.add_representer(OrderedDict, represent_dictionary_order)

setup_yaml()

Then you can use OrderedDict to preserve the order in yaml.dump():

import yaml
from collections import OrderedDict

def represent_dictionary_order(self, dict_data):
    return self.represent_mapping('tag:yaml.org,2002:map', dict_data.items())

def setup_yaml():
    yaml.add_representer(OrderedDict, represent_dictionary_order)

setup_yaml()    

dic = OrderedDict()

dic['a'] = 1
dic['b'] = 2
dic['c'] = 3

print(yaml.dump(dic))
# {a: 1, b: 2, c: 3}   (PyYAML < 5.1; newer versions default to block style)

Upvotes: 8

Anthon

Reputation: 76802

Your difficulties are a result of assumptions on multiple levels that are incorrect and, depending on your YAML parser, might not be transparently resolvable.

In Python's dict the keys are unordered (at least for Python < 3.6). Even though the keys appear in some order in the source file, that order is lost as soon as they are in the dict:

d = {'WaterDepth':0.,'WaveDirection':0.,'WaveGamma':0.,'WaveAlpha':0.}
for key in d:
    print(key)

gives, for example:

WaterDepth
WaveGamma
WaveAlpha
WaveDirection

If you want your keys ordered, you can use the collections.OrderedDict type (or my own ruamel.ordereddict type, which is written in C and more than an order of magnitude faster). You have to add the keys in order, for example as a list of tuples:

from ruamel.ordereddict import ordereddict
# from collections import OrderedDict as ordereddict  # < this will work as well
d = ordereddict([('WaterDepth', 0.), ('WaveDirection', 0.), ('WaveGamma', 0.), ('WaveAlpha', 0.)])
for key in d:
    print(key)

which will print the keys in the order they were specified in the source.
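
The same point can be checked with the standard library alone. This is a minimal sketch, assuming Python 3.7 or later, where insertion order for plain dicts is a language guarantee; on older versions only the OrderedDict result is reliable. The names pairs, ordered and plain are illustrative.

```python
from collections import OrderedDict

# Keys listed in the order the consuming program expects them
pairs = [('WaterDepth', 0.0), ('WaveDirection', 0.0),
         ('WaveGamma', 0.0), ('WaveAlpha', 0.0)]

ordered = OrderedDict(pairs)  # keeps insertion order on every Python version
plain = dict(pairs)           # keeps insertion order only on Python 3.7+

ordered_keys = list(ordered)
plain_keys = list(plain)
print(ordered_keys)
print(plain_keys)
```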

The second problem is that even if a Python dict has a key ordering that happens to be what you want, the YAML specification explicitly says that mappings are unordered, and that is how e.g. PyYAML implements the dumping of a Python dict to a YAML mapping (and the other way around). Also, if you dump an ordereddict or OrderedDict, you normally don't get the plain YAML mapping that you indicate you want, but a tagged YAML entry.
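
To make the "tagged entry" concrete, here is a small sketch using PyYAML rather than ruamel.yaml. It assumes PyYAML 5.1 or later and Python 3.7 or later; the names tagged and plain are illustrative. Converting to a plain dict before dumping is just one possible workaround, not the only one.

```python
from collections import OrderedDict

import yaml  # assumption: PyYAML >= 5.1, which accepts sort_keys

d = OrderedDict([('WaterDepth', 0.0), ('WaveDirection', 0.0)])

# The default Dumper has no plain-mapping representer for OrderedDict,
# so the output carries a Python-specific tag naming the class.
tagged = yaml.dump(d)
print(tagged)

# Converting to a plain dict (ordered on Python 3.7+) and disabling
# key sorting gives an untagged mapping in insertion order.
plain = yaml.dump(dict(d), default_flow_style=False, sort_keys=False)
print(plain)
```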

Losing the order is often undesirable: in your case because your reader assumes some order, in my case because inconsistent key ordering after insertion/deletion made it difficult to compare versions. I therefore implemented round-trip consistency in ruamel.yaml, so you can do:

import sys
import ruamel.yaml as yaml

yaml_str = """\
- BaseFile: myfile.dat
- Environment:
    WaterDepth: 0.0
    WaveDirection: 0.0
    WaveGamma: 0.0
    WaveAlpha: 0.0
"""

data = yaml.load(yaml_str, Loader=yaml.RoundTripLoader)
print(data)
yaml.dump(data, sys.stdout, Dumper=yaml.RoundTripDumper)

which gives you exactly your output result. data works as a dict (and so does data['Environment']), but underneath they are smarter constructs that preserve order, comments, YAML anchor names, etc. You can of course change these (adding/deleting key-value pairs), which is easy, but you can also build them from scratch:

import sys
import ruamel.yaml as yaml
from ruamel.yaml.comments import CommentedMap

baseFile = 'myfile.dat'
lyml = [{'BaseFile': baseFile}]
lyml.append({'Environment': CommentedMap([('WaterDepth', 0.), ('WaveDirection', 0.), ('WaveGamma', 0.), ('WaveAlpha', 0.)])})
yaml.dump(lyml, sys.stdout, Dumper=yaml.RoundTripDumper)

This again prints the contents with the keys in the order you want them. I find the latter less readable than starting from a YAML string, but it does construct the lyml data structure somewhat faster.

Upvotes: 1
