Reputation: 67
---
"main":
  "directory":
    "options":
      "directive": 'options'
      "item":
        "options": 'Stuff OtherStuff MoreStuff'
  "directoryindex":
    "item":
      "directoryindex": 'stuff.htm otherstuff.htm morestuff.html'
  "fileetag":
    "item":
      "fileetag": 'Stuff'
  "keepalive":
    "item":
      "keepalive": 'Stuff'
  "keepalivetimeout":
    "item":
      "keepalivetimeout": 2
Above is a YAML file which I need to parse, edit and then dump. I have chosen to do so with PyYAML on Python 2.7 (I am required to use this version). I have been able to parse and edit the file.
However, the YAML uses one quoting style for keys and different styles for string and integer values, so I cannot set a single default style. How can I make PyYAML dump a different style for each of these types?
Below is what I do to parse and edit:
import yaml
with open('yamlfile') as f:
    infile = yaml.load(f)
# Recursive function to loop through the nested dictionary
def edit(d, keytoedit=None, newvalue=None):
    for key, value in d.iteritems():
        if isinstance(value, dict) and key == keytoedit and 'item' in value:
            value[value.iterkeys().next()] = {keytoedit: newvalue}
            edit(value, keytoedit=keytoedit, newvalue=newvalue)
        elif isinstance(value, dict) and keytoedit in value and 'item' not in value and key != 'main':
            value[keytoedit] = newvalue
            edit(value, keytoedit=keytoedit, newvalue=newvalue)
        elif isinstance(value, dict):
            edit(value, keytoedit=keytoedit, newvalue=newvalue)
# Close the output file explicitly so the dump is flushed to disk
with open('outfile', 'w') as outfile:
    yaml.dump(infile, outfile, default_flow_style=False)
If I pass default_style to yaml.dump, every type gets the same style, but I need to adhere to the conventions of the original YAML file.
Can I somehow specify styles for specific types with PyYAML?
Edit: Here is what I get so far. The missing pieces are the double quotes on the keys and the single quotes on the strings.
main:
  directory:
    options:
      directive: options
      item:
        options: Stuff OtherStuff MoreStuff
  directoryindex:
    item:
      directoryindex: stuff.html otherstuff.htm morestuff.html
  fileetag:
    item:
      fileetag: Stuff
  keepalive:
    item:
      keepalive: 'On'
  keepalivetimeout:
    item:
      keepalivetimeout: 2
Upvotes: 2
Views: 3865
Reputation: 1
Use ruamel.yaml instead; it is better documented than PyYAML: https://pypi.org/project/ruamel.yaml/
Example of the template.yaml file I want to read:
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: >
  lambda_explicit_matchning
  Sample SAM Template for lambda_explicit_matchning
# More info about Globals: https://github.com/awslabs/serverless-application-model/blob/master/docs/globals.rst
Globals:
  Function:
    Timeout: 900
Resources:
  ExplicitAlgoFunction:
    Type: AWS::Serverless::Function # More info about Function Resource: https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#awsserverlessfunction
    Properties:
      MemorySize: 3008
As you can see in my example, some strings are quoted while the integers are not.
Loading and parsing that YAML file is then straightforward (no need to worry about style):
from ruamel.yaml import YAML
yaml = YAML()
# YAML.load() accepts an open stream; the with-block closes the file afterwards
with open("template.yaml", 'r') as f:
    sam_yaml = yaml.load(f)
The ruamel library can read the yaml file without worrying about the style. It's as simple as that :D
Upvotes: 0
Reputation: 76682
You can at least preserve the original flow/block style for the various elements with the normal yaml.dump(), for some value of "normal".
What you need is a loader that saves the flow/block style information while reading the data. You then subclass the normal types that carry a style (mappings/dicts and sequences/lists) so that they behave like the Python constructs normally returned by the loader, but have the style information attached. On the way out, you provide yaml.dump with a custom dumper that takes this style information into account.
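To illustrate only the dumping half of that idea with plain PyYAML, here is a minimal sketch. It does not read any style from the input file; it assumes all keys are strings and simply wraps keys and string values in marker subclasses that a Dumper subclass knows how to emit. The names DoubleQuoted, SingleQuoted, StyledDumper and restyle are made up for this example and are not part of PyYAML.

```python
import yaml


class DoubleQuoted(str):
    """Marker type (hypothetical name): emit this string in double quotes."""


class SingleQuoted(str):
    """Marker type (hypothetical name): emit this string in single quotes."""


class StyledDumper(yaml.SafeDumper):
    """Separate Dumper subclass so the stock PyYAML dumpers stay untouched."""


def _represent_double(dumper, data):
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='"')


def _represent_single(dumper, data):
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style="'")


StyledDumper.add_representer(DoubleQuoted, _represent_double)
StyledDumper.add_representer(SingleQuoted, _represent_single)


def restyle(node):
    """Return a copy of the tree with keys and string values wrapped in markers.

    Assumption: every mapping key is a string.
    """
    if isinstance(node, dict):
        return dict((DoubleQuoted(k), restyle(v)) for k, v in node.items())
    if isinstance(node, list):
        return [restyle(v) for v in node]
    if isinstance(node, str):
        return SingleQuoted(node)
    return node  # ints, floats, None and so on keep PyYAML's default style


data = {
    'main': {
        'keepalive': {'item': {'keepalive': 'Stuff'}},
        'keepalivetimeout': {'item': {'keepalivetimeout': 2}},
    }
}
text = yaml.dump(restyle(data), Dumper=StyledDumper, default_flow_style=False)
print(text)
```

Because the style travels with the value, the same approach extends to other types by adding another marker class and representer. It will not bring back comments or flow/block style from the original file.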
I use the normal yaml.dump in my enhanced version of PyYAML called ruamel.yaml, but with a special dumper class, RoundTripDumper (and a RoundTripLoader for yaml.load), that preserves the flow/block style (and any comments you might have in the file):
import ruamel.yaml as yaml
infile = yaml.load(open('yamlfile'), Loader=yaml.RoundTripLoader)
for key, value in infile['main'].items():
    if key == 'keepalivetimeout':
        item = value['item']
        item['keepalivetimeout'] = 400
print yaml.dump(infile, Dumper=yaml.RoundTripDumper)
gives you:
main:
  directory:
    options:
      directive: options
      item:
        options: Stuff OtherStuff MoreStuff
  directoryindex:
    item:
      directoryindex: stuff.htm otherstuff.htm morestuff.html
  fileetag:
    item:
      fileetag: Stuff
  keepalive:
    item:
      keepalive: Stuff
  keepalivetimeout:
    item:
      keepalivetimeout: 400
If you cannot install ruamel.yaml, you can pull the code out of my repository and include it in your own. AFAIK PyYAML has not been upgraded since I started working on this.
I currently don't preserve the superfluous quotes on the scalars, but I do preserve the chomping information (for multiline scalars starting with '|'). The quoting information is thrown out really early on in the input processing of the YAML file, and preserving it would require multiple changes.
Since you seem to have different quotes for key and value string scalars, you can achieve the output you want by overriding process_scalar (part of the Emitter in emitter.py) to add the quotes based on whether the string scalar is a key and whether it is an integer:
import ruamel.yaml as yaml
# the scalar emitter from emitter.py
def process_scalar(self):
    if self.analysis is None:
        self.analysis = self.analyze_scalar(self.event.value)
    if self.style is None:
        self.style = self.choose_scalar_style()
    split = (not self.simple_key_context)
    # VVVVVVVVVVVVVVVVVVVV added
    try:
        x = int(self.event.value)  # might need to expand this
    except ValueError:
        # we have a string: single quotes for values, double quotes for keys
        if split:
            self.style = "'"
        else:
            self.style = '"'
    # ^^^^^^^^^^^^^^^^^^^^
    # if self.analysis.multiline and split    \
    #         and (not self.style or self.style in '\'\"'):
    #     self.write_indent()
    if self.style == '"':
        self.write_double_quoted(self.analysis.scalar, split)
    elif self.style == '\'':
        self.write_single_quoted(self.analysis.scalar, split)
    elif self.style == '>':
        self.write_folded(self.analysis.scalar)
    elif self.style == '|':
        self.write_literal(self.analysis.scalar)
    else:
        self.write_plain(self.analysis.scalar, split)
    self.analysis = None
    self.style = None
    if self.event.comment:
        self.write_post_comment(self.event)
infile = yaml.load(open('yamlfile'), Loader=yaml.RoundTripLoader)
for key, value in infile['main'].items():
    if key == 'keepalivetimeout':
        item = value['item']
        item['keepalivetimeout'] = 400
# note: this replaces process_scalar on the RoundTripDumper class itself
dd = yaml.RoundTripDumper
dd.process_scalar = process_scalar
print '---'
print yaml.dump(infile, Dumper=dd)
gives you:
---
"main":
  "directory":
    "options":
      "directive": 'options'
      "item":
        "options": 'Stuff OtherStuff MoreStuff'
  "directoryindex":
    "item":
      "directoryindex": 'stuff.htm otherstuff.htm morestuff.html'
  "fileetag":
    "item":
      "fileetag": 'Stuff'
  "keepalive":
    "item":
      "keepalive": 'Stuff'
  "keepalivetimeout":
    "item":
      "keepalivetimeout": 400
which is quite close to what you asked for.
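The int() test in process_scalar is marked "might need to expand this". One possible expansion, sketched below with only the standard library, is a helper that also recognises simple decimal floats. The name looks_numeric and both patterns are assumptions made for this example; they are simplified and do not cover everything the YAML resolver treats as a number (hex, octal, underscores, .inf, .nan).

```python
import re

# Simplified patterns (assumption): plain decimal integers and floats only.
_INT = re.compile(r'^[-+]?[0-9]+$')
_FLOAT = re.compile(r'^[-+]?(?:[0-9]+\.[0-9]*|\.[0-9]+)(?:[eE][-+]?[0-9]+)?$')


def looks_numeric(text):
    """Return True if the scalar text looks like a plain int or float."""
    return bool(_INT.match(text) or _FLOAT.match(text))


for sample in ['400', '-3', '2.5', 'Stuff', 'stuff.htm']:
    print('%r -> %s' % (sample, looks_numeric(sample)))
```

Inside process_scalar, the try/except around int() could then become a plain "if not looks_numeric(self.event.value):" followed by the same two style assignments. Like the int() test, this works on the text only, so a string value that merely looks like a number would still be written without quotes.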
Upvotes: 2