Reputation: 739
I need to sort a YAML file with comments. I'm using the ruamel.yaml library to preserve the comments from the YAML, but when I sort the items, the comments end up in the wrong place.
people = """\
# manager of project
- title: manager
# owner of company
- title: owner
"""
import ruamel.yaml, sys
yaml = ruamel.yaml.YAML()
arr = yaml.load(people)
arr = sorted(arr, key=lambda x: x['title'])
yaml.dump(arr, sys.stdout)
With this code I'm getting the following output:
- title: manager
# owner of company
- title: owner
During the sort, the comment for the first element is lost. How can I keep the first comment for the list?
Upvotes: 4
Views: 5314
Reputation: 76599
Your first comment, at the beginning of the document, has no preceding node and gets a special place on the arr object (which is of type ruamel.yaml.comments.CommentedSeq). You can inspect this by doing print(arr.ca) (ca for comment attribute), directly after loading.
After loading there is a second comment attribute attached to the dict-like object constructed from the mapping arr[0] and a third comment attribute attached to the dict-like object constructed from arr[1] (in much the same way as the first comment is attached to arr).
The sorting operation is not done in place, hence only the comments associated with the elements that are sorted stay put. The arr after the assignment of the result of sorted() (a simple list) is not even the same type as the arr loaded from YAML (a CommentedSeq), and of course has no comments associated with it.
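The difference between sorted() and an in-place sort can be seen without ruamel.yaml at all. The following is a minimal sketch using only the standard library; AnnotatedList and its note attribute are hypothetical stand-ins for CommentedSeq and its comment attribute, not part of any library.

```python
class AnnotatedList(list):
    """Hypothetical stand-in for CommentedSeq: a list carrying extra data."""

    def __init__(self, items, note=None):
        super().__init__(items)
        self.note = note  # plays the role of the comment attribute


items = AnnotatedList(
    [{"title": "owner"}, {"title": "manager"}],
    note="# head comment",
)

# sorted() always builds a brand-new plain list, so the subclass
# and the attribute attached to the original object are gone.
copy = sorted(items, key=lambda x: x["title"])
print(type(copy).__name__, hasattr(copy, "note"))  # list False

# list.sort() reorders the existing object, so the attribute stays attached.
items.sort(key=lambda x: x["title"])
print(type(items).__name__, items.note)  # AnnotatedList # head comment
```

The elements themselves are the same objects in both cases, which is why anything attached to the individual elements survives either kind of sort.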
So what you need to do is preserve the comment information, make the result of sorted() the appropriate type, and then assign the preserved information to that object. Fortunately this only requires one changed line and two added lines in your code:
import sys
import ruamel.yaml
people = """\
# manager of project
- title: manager
# owner of company
- title: owner
"""
yaml = ruamel.yaml.YAML()
arr = yaml.load(people)
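# added: keep the comment information attached to the loaded sequence
root_comment = arr.ca
# changed: wrap the plain list returned by sorted() in a CommentedSeq
arr = ruamel.yaml.comments.CommentedSeq(sorted(arr, key=lambda x: x['title']))
# added: reattach the preserved comment information (private attribute)
arr._yaml_comment = root_comment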
yaml.dump(arr, sys.stdout)
which gives:
# manager of project
- title: manager
# owner of company
- title: owner
However, in general this kind of extensive manipulation will get you into trouble. It is probably better to try doing this with an in-place sort.
Upvotes: 3