MediocreTechie

Reputation: 73

How do I dump an OrderedDict out as a YAML file?

I wrote a simple script that uses ruamel.yaml to write out a YAML file (example shown below). I used collections.OrderedDict so that I can reorder the keys alphabetically, but even after reordering and converting it back to a plain dictionary using json.dumps/json.loads, I am unable to print it out in order.

I understand that the YAML specification doesn't care about key order, but I would still like the YAML file to be ordered. What is the correct way to do this with the ruamel.yaml module?

logging.to_syslog: 'false'
statsbeat:
  multicast_interface_name: 'p1p1'
  primary_field_name: 'primary'
  udp_address: '239.253.0.50:20016'
  all_documents_index: 'statsall-${statsbeat.exchange_code}-${statsbeat.name}'
  exchange_code: 'd'
  primary_field_algorithm: 'range'
  cloud_type: 'none'
  primary_field_algorithm_range: '1-48'
  name: 'otpr'
logging.files:
  permissions: '0644'
  rotateeverybytes: 52428800
  keepfiles: 7
  name: '${statsbeat.name}.log'

Upvotes: 2

Views: 3784

Answers (1)

Anthon

Reputation: 76802

When you load a YAML file in ruamel.yaml's default round-trip mode, a mapping is loaded into a CommentedMap (defined in the module ruamel.yaml.comments). CommentedMap is a subclass of OrderedDict (or of ruamel.ordereddict on Python 2).

So one thing you can do is convert your OrderedDict to a CommentedMap:

import sys
import ruamel.yaml
from ruamel.yaml.comments import CommentedMap
from collections import OrderedDict

data = OrderedDict([
  ('logging.to_syslog', 'false'),
  ('statsbeat', OrderedDict([
    ('multicast_interface_name',  'p1p1'),
    ('primary_field_name',  'primary'),
    ('udp_address',  '239.253.0.50:20016'),
    ('all_documents_index',  'statsall-${statsbeat.exchange_code}-${statsbeat.name}'),
    ('exchange_code',  'd'),
    ('primary_field_algorithm',  'range'),
    ('cloud_type',  'none'),
    ('primary_field_algorithm_range',  '1-48'),
    ('name',  'otpr'),
  ])),
  ('logging.files', OrderedDict([
    ('permissions',  '0644'),
    ('rotateeverybytes',  52428800),
    ('keepfiles',  7),
    ('name',  '${statsbeat.name}.log'),
  ])),
])

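def comseq(d):
    """Recursively convert (nested) OrderedDicts to CommentedMaps."""
    if isinstance(d, OrderedDict):
        cs = CommentedMap()
        for k, v in d.items():
            cs[k] = comseq(v)  # CommentedMap keeps insertion order
        return cs
    return d  # leave scalars and other values untouched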


data = comseq(data)


yaml = ruamel.yaml.YAML()
yaml.dump(data, sys.stdout)

which gives:

logging.to_syslog: 'false'
statsbeat:
  multicast_interface_name: p1p1
  primary_field_name: primary
  udp_address: 239.253.0.50:20016
  all_documents_index: statsall-${statsbeat.exchange_code}-${statsbeat.name}
  exchange_code: d
  primary_field_algorithm: range
  cloud_type: none
  primary_field_algorithm_range: 1-48
  name: otpr
logging.files:
  permissions: '0644'
  rotateeverybytes: 52428800
  keepfiles: 7
  name: ${statsbeat.name}.log

(If you want the superfluous single quotes as in your example, you can wrap each string in SingleQuotedScalarString, imported from ruamel.yaml.scalarstring.)


What is probably easier, though, is to instruct the representer to represent an OrderedDict in the same way as a CommentedMap. Assuming the same imports and definition of data as before, you do:

from ruamel.yaml.representer import RoundTripRepresenter

class MyRepresenter(RoundTripRepresenter):
    pass

# register on the subclass, so the stock RoundTripRepresenter is not changed
ruamel.yaml.add_representer(OrderedDict, MyRepresenter.represent_dict,
                            representer=MyRepresenter)

yaml = ruamel.yaml.YAML()
yaml.Representer = MyRepresenter

yaml.dump(data, sys.stdout)

with exactly the same result as before.
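Neither approach sorts anything: both dump the keys in the order in which the OrderedDict holds them. If you want the alphabetical order mentioned in the question, you have to sort before dumping. The following is a minimal sketch using only the standard library; the helper name `sort_keys` and the sample data are made up for illustration, and it assumes all keys within one mapping are mutually comparable (e.g. all strings).

```python
from collections import OrderedDict


def sort_keys(d):
    """Return a copy of d with keys sorted alphabetically at every level."""
    if isinstance(d, dict):  # also matches OrderedDict
        return OrderedDict((k, sort_keys(d[k])) for k in sorted(d))
    return d  # non-mapping values are returned unchanged


# Illustrative subset of the data from the question.
unsorted = OrderedDict([
    ('statsbeat', OrderedDict([
        ('udp_address', '239.253.0.50:20016'),
        ('cloud_type', 'none'),
        ('name', 'otpr'),
    ])),
    ('logging.to_syslog', 'false'),
])

ordered = sort_keys(unsorted)
print(list(ordered))
print(list(ordered['statsbeat']))
```

The resulting OrderedDict can then be dumped with either of the two approaches above.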

Upvotes: 2
