hoodakaushal

Reputation: 1293

Python update YAML without changing formatting

I want to update a particular property in a YAML file while leaving the rest untouched, including formatting, comments, etc. I'm using ruamel.yaml, but when I save the file its formatting gets messed up. I'm using Python 3.9.5 and ruamel.yaml version 0.17.10.

test.yaml:

sso:
  version: 1.0.0
  configs:
  - configName: config1.conf
    fileContent: 'startDelaySeconds: 0

      lowercaseOutputName: true

      lowercaseOutputLabelNames: false

      whitelistObjectNames: [''org.apache.commons.pool2:*'']

      blacklistObjectNames: []

      rules:

      - pattern: ".*"

      '

I update the version with:

from ruamel.yaml import YAML
yaml = YAML()
with open('test.yaml') as f:
    test = yaml.load(f)
test['sso']['version'] = '1.0.1'
with open('test2.yaml', 'w') as f:
    yaml.dump(test, f)

But the format of fileContent in the saved file has changed (newlines replaced by \n, etc.). test2.yaml:

sso:
  version: 1.0.1
  configs:
  - configName: config1.conf
    fileContent: "startDelaySeconds: 0\nlowercaseOutputName: true\nlowercaseOutputLabelNames:\
      \ false\nwhitelistObjectNames: ['org.apache.commons.pool2:*']\nblacklistObjectNames:\
      \ []\nrules:\n- pattern: \".*\"\n"

Upvotes: 1

Views: 3101

Answers (1)

Anthon

Reputation: 76578

What you probably missed is that the quoting of the value in your output changes from single to double quotes. By default ruamel.yaml doesn't try to preserve quotes, which lets it remove superfluous quotes or, as in this case, use a more compact representation: double quoting, which doesn't require the single quotes within the scalar to be doubled.

You can preserve the quotes by setting the .preserve_quotes attribute:

import sys
from ruamel.yaml import YAML

yaml = YAML()
yaml.preserve_quotes = True
with open('test.yaml') as f:
    test = yaml.load(f)
test['sso']['version'] = '1.0.1'
yaml.dump(test, sys.stdout)

which gives:

sso:
  version: 1.0.1
  configs:
  - configName: config1.conf
    fileContent: 'startDelaySeconds: 0

      lowercaseOutputName: true

      lowercaseOutputLabelNames: false

      whitelistObjectNames: [''org.apache.commons.pool2:*'']

      blacklistObjectNames: []

      rules:

      - pattern: ".*"

      '

I recommend using a literal style scalar for values that contain newlines: in those you need to double neither the newlines nor the single quotes in your scalar:

from ruamel.yaml import YAML

# The same document, with fileContent written as a literal style scalar.
yaml_str = """
sso:
  version: 1.0.0
  configs:
  - configName: config1.conf
    fileContent: |
      startDelaySeconds: 0
      lowercaseOutputName: true
      lowercaseOutputLabelNames: false
      whitelistObjectNames: ['org.apache.commons.pool2:*']
      blacklistObjectNames: []
      rules:
      - pattern: ".*"
"""

yaml = YAML()
yaml.preserve_quotes = True
with open('test.yaml') as f:
    test = yaml.load(f)

data = yaml.load(yaml_str)
# Both styles load to the same string value.
assert data['sso']['configs'][0]['fileContent'] == test['sso']['configs'][0]['fileContent']

Upvotes: 3
