Alexander C.

Reputation: 739

How to keep comments in ruamel

I need to sort a YAML file that contains comments. I'm using the ruamel.yaml library to preserve the comments, but when I sort the items, the comments end up in the wrong place.

people = """\
# manager of project
- title: manager

# owner of company
- title: owner
"""

import sys
import ruamel.yaml

yaml = ruamel.yaml.YAML()
arr = yaml.load(people)
arr = sorted(arr, key=lambda x: x['title'])
yaml.dump(arr, sys.stdout)

With this code I get the following output:

- title: manager

# owner of company
- title: owner

During the sort, the comment for the first element is lost. How can I keep the first comment of the list?

Upvotes: 4

Views: 5314

Answers (1)

Anthon

Reputation: 76599

Your first comment, at the beginning of the document, has no preceding node, so it gets a special place on the arr object (which is of type ruamel.yaml.comments.CommentedSeq). You can inspect this with print(arr.ca) (ca stands for comment attribute) directly after loading.

After loading, there is a second comment attribute attached to the dict-like object constructed from the mapping arr[0], and a third attached to the dict-like object constructed from arr[1] (in much the same way as the first comment attribute is attached to arr).

The sorting operation is not done in place: sorted() returns a new object, so only the comments associated with the elements being sorted stay put. After the assignment, arr holds the result of sorted(), a simple list, which is not even the same type as the arr loaded from YAML (a CommentedSeq) and of course has no comments associated with it.

So what you need to do is preserve the comment information, make the result of sorted() the appropriate type, and then assign the preserved information to that object. Fortunately this requires only one changed and two added lines in your code:

import sys
import ruamel.yaml

people = """\
# manager of project
- title: manager

# owner of company
- title: owner
"""

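yaml = ruamel.yaml.YAML()
arr = yaml.load(people)
# added: keep the comment attribute of the loaded sequence
root_comment = arr.ca
# changed: wrap the plain list returned by sorted() in a CommentedSeq
arr = ruamel.yaml.comments.CommentedSeq(sorted(arr, key=lambda x: x['title']))
# added: re-attach the preserved comments (_yaml_comment is a private attribute)
arr._yaml_comment = root_comment
yaml.dump(arr, sys.stdout)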

which gives:

# manager of project
- title: manager

# owner of company
- title: owner

However, in general this kind of extensive manipulation of internals will get you into trouble. It is probably better to try doing this with an in-place sort.

Upvotes: 3
