Balanced

Reputation: 48

How to prevent ruamel from breaking a line in the middle of a value?

I have two questions!

  1. Is there a way to prevent round_trip_dump or even just a regular dump from breaking a line in the middle of a sentence? Whenever I have a long sentence (i.e. a description) in my YAML file, and I use a script to change some stuff, it will break a line and break my file.

  2. What is the difference between dump and round_trip_dump?

This is my code:

import ruamel.yaml

yml = "test.yml"

data = ruamel.yaml.round_trip_load(open(yml, "r"), preserve_quotes=True)
ruamel.yaml.round_trip_dump(data, open(yml, "w"))

This is my current file:

person_1:
  name: Michael
  age: 20
  description: A very cool person with some really cool text, to show this issue, how do I fix this, it's going to break a line in a few words

I want to simply load and dump it (and fix the indentation, but in this case, it's already fixed). So, when I run my code, I get this:

person_1:
   name: Michael
   age: 20
   description: A very cool person with some really cool text, to show this issue,
   how do I fix this, it's going to break a line in a few words

Upvotes: 2

Views: 2003

Answers (1)

Anthon

Reputation: 76578

First of all, you cannot actually get the output that you show, because it is invalid YAML: the line in the file starting with spaces and how do I will (have to) be indented more than the key description. Secondly, without specifying a different indent, you cannot get a three-space indent in ruamel.yaml.

So that output is either not from the program you present, or you made formatting errors.

The output you get is:

person_1:
  name: Michael
  age: 20
  description: A very cool person with some really cool text, to show this issue,
    how do I fix this, it's going to break a line in a few words

and this is semantically the same as your input. That last (how do...) line is a continuation line for the plain scalar starting with A very cool. On loading there will be no newline, just a space between issue, and how.

You get the continuation line because your content is wider than the default output width, so the easiest fix is to increase the width from its default "best width" of 80.

I also recommend using the new API (which is already getting old) and using the filename extension .yaml (this has been the recommended extension since Sep 2006).

import ruamel.yaml

yaml_file = "test.yaml"

yaml = ruamel.yaml.YAML()
yaml.indent(mapping=3, sequence=2, offset=0)  # sequence and offset have their default values here
yaml.preserve_quotes = True
yaml.width = 800    # this is the output line width after which wrapping occurs
with open(yaml_file) as fp:
    data = yaml.load(fp)
with open(yaml_file, 'w') as fp:
    yaml.dump(data, fp)

After which the output file looks like the original, but indented three positions:

person_1:
   name: Michael
   age: 20
   description: A very cool person with some really cool text, to show this issue, how do I fix this, it's going to break a line in a few words

The default in the new API is round-trip (i.e. YAML(typ='rt')). If you want the equivalent of the old function dump() (without a Dumper argument), you should use yaml = YAML(typ='unsafe'). Dumping in itself is not unsafe, but the equivalent old-style load() function is.

The difference between rt and unsafe (which largely equals the difference between round_trip_dump and dump) is primarily that the former knows about all the special things that the round-trip loader preserves:

  • style
  • comments
  • anchor/alias names
  • integer "style" (octal, binary, hex, leading zeros)
  • float "style" (scientific notation)
  • optional: quotes around scalars
  • dumping tagged objects loaded from YAML (without having special definitions registered)

The unsafe/normal dump knows how to dump most Python objects, whereas you have to register special dumpers if you use the round-trip (or safe) dumper.

You should not use the unsafe dumper to dump data that you loaded with the round-trip loader:

yaml_i = ruamel.yaml.YAML()
yaml_o = ruamel.yaml.YAML(typ='unsafe')
with open(yaml_file) as fp:
    data = yaml_i.load(fp)
with open(yaml_file, 'w') as fp:
    yaml_o.dump(data, fp)

It will probably work, but the output is "unreadable" (and comments etc. will be lost). The other way around works, but is, of course, not recommended.

Upvotes: 1
