mogul

Reputation: 4553

yaml.load, force dict keys to strings

In Python 3 I am loading a piece of YAML. The loader tries to guess the right types, but I'm not quite satisfied with the result: I want dict keys to always be strings.

First, a minimal piece of YAML to demonstrate, easy to paste directly into your Python interpreter. Needless to say, my real-world data is far more complex.

import yaml

txt = """
---
one: 1
2: two
"""

First the "regular" load:

yaml.load(txt)
{2: 'two', 'one': 1}

Notice how the key 2 got loaded as a number and not as a string. Then let's try something different:

yaml.load(txt, Loader=yaml.BaseLoader)
{'2': 'two', 'one': '1'}

Now everything is loaded as a string. Unfortunately that includes the value 1, which I need as a number.

So I can either have both keys and values forced to strings, or neither.

I can of course write a post-processor that traverses the loaded data and copies it to a new variable with dict keys forced to strings, but I imagine it could be done more elegantly within the YAML loader.
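For reference, the post-processing approach might look roughly like the sketch below. The function name `stringify_keys` is hypothetical, and it assumes only numeric keys (int and float, not booleans) need converting and that the data consists of nested dicts and lists. The input dict stands in for what the loader would return.

```python
def stringify_keys(data):
    """Return a copy of data with int/float dict keys converted to str."""
    if isinstance(data, dict):
        result = {}
        for key, value in data.items():
            # bool is a subclass of int, so leave True/False keys alone
            if isinstance(key, (int, float)) and not isinstance(key, bool):
                key = str(key)
            result[key] = stringify_keys(value)
        return result
    if isinstance(data, list):
        return [stringify_keys(item) for item in data]
    # scalars (including numeric values) are returned unchanged
    return data


# stand-in for already-loaded YAML data, with one nested mapping added
loaded = {2: 'two', 'one': 1, 'nested': [{3: 'three'}]}
print(stringify_keys(loaded))
# {'2': 'two', 'one': 1, 'nested': [{'3': 'three'}]}
```

It works, but it walks and copies the whole structure a second time, which is what I would like to avoid.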

Suggestions?

Upvotes: 6

Views: 4377

Answers (2)

FunkyPotato

Reputation: 51

The accepted answer won't work on floats (I don't know if that's your intention), and it will change booleans to strings too. It is also a monkey-patch.
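To illustrate the difference, here is a small sketch comparing the two key checks outside of YAML. Both helper names are made up for this comparison; the first mirrors the `isinstance(key, int)` test from the accepted answer, the second the test used in the loader further down.

```python
def convert_key_int_check(key):
    # same test as the accepted answer: any int, which includes bool
    return str(key) if isinstance(key, int) else key


def convert_key_number_check(key):
    # exclude bool explicitly, then accept int and float
    if isinstance(key, bool):
        return key
    return str(key) if isinstance(key, (int, float)) else key


for key in (2, 3.0, True):
    print(repr(key), repr(convert_key_int_check(key)), repr(convert_key_number_check(key)))
# 2 '2' '2'
# 3.0 3.0 '3.0'
# True 'True' True
```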

To do it the right way you need to subclass one of the loaders, for example yaml.SafeLoader:

import yaml


class MyLoader(yaml.SafeLoader):
    def construct_mapping(self, *args, **kwargs):
        mapping = super().construct_mapping(*args, **kwargs)

        for key in list(mapping.keys()):
            # bool is a subclass of int
            if not isinstance(key, bool) and isinstance(key, (int, float)):
                mapping[str(key)] = mapping.pop(key)

        return mapping

txt = """
---
one: 1
2: two
3.0: three
"""

yaml.load(txt, Loader=MyLoader)

That will give you:

{'one': 1, '2': 'two', '3.0': 'three'}

If you don't want it to convert floats, replace (int, float) with int.

Beware that it silently drops an entry when an integer key and a string key become identical after conversion:

txt = """
1: one
'1': another one
"""
yaml.load(txt, Loader=MyLoader)

Will give:

{'1': 'one'}
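If losing data silently is not acceptable, one option is to raise on such collisions instead. The following is only a sketch: `StrictStrKeyLoader` and `loads_strict` are hypothetical names, and raising `ValueError` is an arbitrary choice.

```python
import yaml


class StrictStrKeyLoader(yaml.SafeLoader):
    """Hypothetical loader: stringifies numeric keys, refuses to drop data."""

    def construct_mapping(self, node, deep=False):
        mapping = super().construct_mapping(node, deep=deep)
        result = {}
        for key, value in mapping.items():
            # bool is a subclass of int, so exclude it explicitly
            is_number = isinstance(key, (int, float)) and not isinstance(key, bool)
            new_key = str(key) if is_number else key
            if new_key in result:
                raise ValueError(f"key collision after conversion: {new_key!r}")
            result[new_key] = value
        return result


def loads_strict(text):
    # hypothetical convenience wrapper around yaml.load
    return yaml.load(text, Loader=StrictStrKeyLoader)


print(loads_strict("one: 1\n2: two\n"))
# {'one': 1, '2': 'two'}
```

With the colliding input from above, `loads_strict` raises `ValueError` rather than returning a mapping with a missing entry.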

Upvotes: 1

Anthon

Reputation: 76902

You can do this with a few lines of code, changing each mapping that is being constructed to have integer type keys converted to strings on the fly. You can subclass the SafeLoader, but then you need to register constructors. It is easiest to just patch the mapping constructor:

import yaml

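def my_construct_mapping(self, node, deep=False):
    # build the mapping with the original constructor first
    data = self.construct_mapping_org(node, deep)
    # convert int keys to str (bool is a subclass of int, so True/False keys are converted too)
    return {(str(key) if isinstance(key, int) else key): data[key] for key in data}

# keep a reference to the original method, then patch in the wrapper
yaml.SafeLoader.construct_mapping_org = yaml.SafeLoader.construct_mapping
yaml.SafeLoader.construct_mapping = my_construct_mapping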


yaml_str = """\
---
one: 1
2: two
"""

data = yaml.safe_load(yaml_str)
print(data)

which gives:

{'one': 1, '2': 'two'}

There is never a reason to use the default, unsafe, yaml.load() (i.e. without a Loader= parameter).
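As a quick illustration (a sketch, run without the patch above applied): `yaml.safe_load()` gives the same result as passing `Loader=yaml.SafeLoader` explicitly, and the safe loader rejects Python-specific tags instead of constructing objects from them.

```python
import yaml

doc = "one: 1\n2: two\n"

a = yaml.safe_load(doc)
b = yaml.load(doc, Loader=yaml.SafeLoader)
print(a == b)
# True

try:
    # a Python-specific tag that the safe loader does not know how to construct
    yaml.safe_load("!!python/tuple [1, 2]")
except yaml.YAMLError:
    rejected = True
else:
    rejected = False
print(rejected)
# True
```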

Upvotes: 3
