Jeroen Jacobs

Reputation: 1525

pyyaml and using quotes for strings only

I have the following YAML file:

---
my_vars:
  my_env: "dev"
  my_count: 3

When I read it with PyYAML and dump it again, I get the following output:

---
my_vars:
  my_env: dev
  my_count: 3

The code in question:

with open(env_file) as f:
    env_dict = yaml.safe_load(f)
    print(yaml.dump(env_dict, indent=4, default_flow_style=False, explicit_start=True))

I tried using the default_style parameter:

with open(env_file) as f:
    env_dict = yaml.safe_load(f)
    print(yaml.dump(env_dict, indent=4, default_flow_style=False, explicit_start=True, default_style='"'))

But now I get:

---
"my_vars":
  "my_env": "dev"
  "my_count": !!int "3"

What do I need to do to keep the original formatting, without making any assumptions about the variable names in the YAML file?

Upvotes: 29

Views: 42473

Answers (3)

Anthon

Reputation: 76578

I suggest you switch to YAML 1.2 (released in 2009) by using the backwards-compatible ruamel.yaml package instead of PyYAML, which implements most of YAML 1.1 (2005). (Disclaimer: I am the author of that package.)

Then you just set preserve_quotes = True on the YAML instance before loading, so that the quotes survive round-tripping the YAML file:

import sys
import ruamel.yaml

yaml_str = """\
---
my_vars:
  my_env: "dev"    # keep "dev" quoted
  my_count: 3
"""
yaml = ruamel.yaml.YAML()
yaml.preserve_quotes = True
yaml.explicit_start = True
data = yaml.load(yaml_str)
yaml.dump(data, sys.stdout)

which outputs (including the preserved comment):

---
my_vars:
  my_env: "dev"    # keep "dev" quoted
  my_count: 3

After loading, the string scalars are instances of a subclass of str, which lets them carry the quoting information; for all other purposes they behave like normal strings. If you want to replace such a string, though (dev to fgw), you have to wrap the new value in the same subclass (DoubleQuotedScalarString from ruamel.yaml.scalarstring), otherwise it is dumped without quotes.

When round-tripping, ruamel.yaml by default also preserves the (insertion) order of the keys.

Upvotes: 22

aspiring1

Reputation: 446

You can use the following method to keep a value as a double-quoted scalar when dumping with PyYAML:

Taking your yaml example:

---
my_vars:
  my_env: "dev"
  my_count: 3

Loading it into an env_dict (dictionary):

import sys
import yaml

myyaml = '''
---
my_vars:
  my_env: "dev"
  my_count: 3
'''

env_dict = yaml.load(myyaml, yaml.FullLoader)  # load the YAML into a dict

print(env_dict)
# {'my_vars': {'my_env': 'dev', 'my_count': 3}}

# Define a quoted class, which uses style = '"' and add representer to yaml

class quoted(str):
    pass

def quoted_presenter(dumper, data):
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='"')
yaml.add_representer(quoted, quoted_presenter)


# Now, we update the dictionary env_dict as follows for the "dev" 
# value which needs to be a double quoted scalar

env_dict['my_vars'].update(my_env = quoted("dev")) # this makes "dev"
# a double quoted scalar

# Now, we dump the yaml as before

yaml.dump(env_dict, sys.stdout, indent=4, default_flow_style=False, explicit_start=True)

# which outputs

---
my_vars:
    my_count: 3
    my_env: "dev"
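The update step above has to name the key that should stay quoted. One possible way to avoid that, sketched here and not part of the method above, is to let the loader do the wrapping: PyYAML records the quoting style on each scalar node, so a custom constructor can return the quoted subclass for every double-quoted scalar. The names Quoted, QuoteLoader and QuoteDumper are made up for this sketch, and sort_keys=False assumes PyYAML 5.1 or later.

```python
import yaml


class Quoted(str):
    """Marker type for strings that were double-quoted in the input."""


class QuoteLoader(yaml.SafeLoader):
    pass


class QuoteDumper(yaml.SafeDumper):
    pass


def construct_str(loader, node):
    value = loader.construct_scalar(node)
    # node.style is '"' for double-quoted scalars and None for plain ones
    return Quoted(value) if node.style == '"' else value


def represent_quoted(dumper, data):
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='"')


# Register on the subclasses so the global loader/dumper stay untouched
QuoteLoader.add_constructor('tag:yaml.org,2002:str', construct_str)
QuoteDumper.add_representer(Quoted, represent_quoted)

doc = '---\nmy_vars:\n  my_env: "dev"\n  my_count: 3\n'

data = yaml.load(doc, Loader=QuoteLoader)
out = yaml.dump(data, Dumper=QuoteDumper, default_flow_style=False,
                explicit_start=True, sort_keys=False)
print(out)
```

This only tracks double quotes; single-quoted scalars, comments and other formatting are still lost on the way through.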

These questions helped me arrive at this answer:

Any yaml libraries in Python that support dumping of long strings as block literals or folded blocks?

How can I control what scalar form PyYAML uses for my data?

The article "To Quote or not to Quote?" is also a good read on the topic.

Hope this helps!

Upvotes: 10

Nobilis

Reputation: 7448

Right, so borrowing heavily from this answer, you can do something like this:

import yaml

# define a custom representer for strings
def quoted_presenter(dumper, data):
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='"')

yaml.add_representer(str, quoted_presenter)


env_file = 'input.txt'
with open(env_file) as f:
    env_dict = yaml.safe_load(f)
    print(yaml.dump(env_dict, default_flow_style=False))

However, this registers the representer for every str in the dictionary, so it quotes the keys as well, not just the values.

It prints:

"my_vars":
  "my_count": 3
  "my_env": "dev"
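If only the values should be quoted, one possible variation (a sketch, not what the code above does) is to register the representer for a str subclass and wrap just the string values before dumping. The names Quoted, ValueQuoteDumper and quote_values are hypothetical. Note that this quotes every string value, whether or not it was quoted in the input.

```python
import yaml


class Quoted(str):
    """Marker type for strings that should be dumped in double quotes."""


class ValueQuoteDumper(yaml.SafeDumper):
    pass


def quoted_presenter(dumper, data):
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='"')


ValueQuoteDumper.add_representer(Quoted, quoted_presenter)


def quote_values(obj):
    """Recursively wrap string values (not mapping keys) in Quoted."""
    if isinstance(obj, dict):
        return {key: quote_values(value) for key, value in obj.items()}
    if isinstance(obj, list):
        return [quote_values(item) for item in obj]
    if isinstance(obj, str):
        return Quoted(obj)
    return obj


env_dict = yaml.safe_load('my_vars:\n  my_env: "dev"\n  my_count: 3\n')
out = yaml.dump(quote_values(env_dict), Dumper=ValueQuoteDumper,
                default_flow_style=False)
print(out)
```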

Is this what you want? Not sure what you mean by variable names, do you mean the keys?

Upvotes: 10
