user45183

Reputation: 609

How to dump a folded scalar to YAML in Python (using ruamel?)

I've been scouring Stack Overflow looking for a way to dump a folded scalar in YAML format using Python. A common answer is from user Anthon, who suggests using his ruamel.yaml Python library. I took the advice, but I cannot figure out how to dump a long Python string value in folded style.

In his answers, Anthon often uses a hard-coded string with the folded style indicator ">" to illustrate his point, like so:

yaml_str = """\
long: >
  Line1
  Line2
  Line3
"""
data = yaml.load(yaml_str, Loader=yaml.RoundTripLoader)
print(yaml.dump(data, Dumper=yaml.RoundTripDumper))

I'm not sure how to translate that example into my own code, where the string I'd like to dump does not come from a hard-coded value that already contains the folded indicator, but from a Django request (it could come from anywhere, really; the point is that I'm not constructing the string manually with ">").

Am I really meant to do something like:

stringToDumpFolded = "ljasdfl\n\nksajdf\r\n;lak'''sjf"

data = "Key: > \n" + stringToDumpFolded

ruamel.yaml.dump(data, f, Dumper=ruamel.yaml.RoundTripDumper)

Otherwise, given a long unicode string variable, how do I use ruamel to dump it to a file?

Upvotes: 3

Views: 1650

Answers (1)

Anthon

Reputation: 76742

Starting with version 0.15.61, it is possible to round-trip folded scalars in ruamel.yaml:

import sys
import ruamel.yaml

yaml_str = """\
long: >
  Line1
  Line2
  Line3
"""

yaml = ruamel.yaml.YAML()
data = yaml.load(yaml_str)
print(type(data['long']), data['long'].fold_pos, end='\n-----\n')
yaml.dump(data, sys.stdout)

which gives:

<class 'ruamel.yaml.scalarstring.FoldedScalarString'> [5, 11]
-----
long: >
  Line1
  Line2
  Line3

The type is printed only to show which kind of object you need to create yourself when working from scratch:

from ruamel.yaml.scalarstring import FoldedScalarString as folded

s = folded('Line1 Line2 Line3\n')
data = dict(long=s)
yaml.dump(data, sys.stdout)

which gives a folded scalar, but probably not the way you want:

long: >
  Line1 Line2 Line3

To get this to fold, you have to provide the .fold_pos attribute. That attribute has to be a list (or some other reversible iterable) with the positions of the space characters in the string where folds need to be inserted:

s = folded('Line1 Line2 Line3\n')
s.fold_pos = [5, 11]
data = dict(long=s)
yaml.dump(data, sys.stdout)

which returns the output you expected:

long: >
  Line1
  Line2
  Line3

Since you seem to want all spaces to be folded, you can also do something like:

import re
s.fold_pos = [x.start() for x in re.finditer(' ', s)]
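
For a long string that arrives at runtime (from a request, for example), folding at every space is probably not what you want. As a rough sketch, you could instead compute fold positions so that each line stays within some width. The names fold_positions and dump_folded and the default width of 40 are made up for this illustration and are not part of ruamel.yaml; only FoldedScalarString and its fold_pos attribute come from the library. The helper only picks spaces that have a non-space character on both sides, on the assumption that a fold position is only useful at such a single space:

```python
def fold_positions(text, width=40):
    """Return indexes of spaces in `text` at which to fold so that,
    where possible, no line is longer than `width` characters.

    Hypothetical helper, not part of ruamel.yaml.
    """
    positions = []
    line_start = 0     # index at which the current output line begins
    last_space = None  # most recent foldable space on the current line
    for i, ch in enumerate(text):
        if ch == '\n':
            # A newline in the text already starts a new line.
            line_start = i + 1
            last_space = None
            continue
        # Only consider a single space between two non-space characters.
        if (ch == ' ' and 0 < i < len(text) - 1
                and not text[i - 1].isspace()
                and not text[i + 1].isspace()):
            last_space = i
        # The current line has reached the width: fold at the last space seen.
        if i - line_start >= width and last_space is not None:
            positions.append(last_space)
            line_start = last_space + 1
            last_space = None
    return positions


def dump_folded(key, text, stream, width=40):
    """Dump {key: text} to `stream` with `text` as a folded scalar.

    Hypothetical wrapper for illustration.
    """
    # Imported here so fold_positions() stays usable without ruamel.yaml.
    from ruamel.yaml import YAML
    from ruamel.yaml.scalarstring import FoldedScalarString

    s = FoldedScalarString(text)
    s.fold_pos = fold_positions(text, width)
    YAML().dump({key: s}, stream)
```

With this in place, writing to a file would look like `with open('out.yaml', 'w') as f: dump_folded('long', some_text, f)`, where some_text is whatever string you received. A word longer than the width is left unbroken, since there is no space to fold at.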

Upvotes: 2
