Zubo

Reputation: 1593

Python: YAML dictionary of functions: how to load without converting to strings

I have a YAML config file, which contains a dictionary, like so:

"COLUMN_NAME": column_function

It maps strings to functions (which exist and are supposed to be called).

However, when I load it using yaml, I see that the loaded dictionary now maps strings to strings:

'COLUMN_NAME': 'column_function' 

Now I cannot use it as intended - 'column_function' doesn't point to column_function.

What would be a good way to load my dict so that it maps to my functions? After searching and reading a bit on this issue, I'm very cautious about using eval or something like that, since the config file is user-edited.

I think this thread is about my issue, but I'm not sure on the best way to approach it.

Should I look up the string value of each key-value pair of my config dict in globals()? Is this a good way:

for (key, val) in STRING_DICTIONARY.items():
    try: 
        STRING_DICTIONARY[key] = globals()[val]   
    except KeyError:
        print("The config file specifies a function \"" + val 
               + "\" for column \"" + key 
               + "\". No such function is defined, however. ")

Upvotes: 8

Views: 6199

Answers (1)

Anthon

Reputation: 76912

To look up a name val and call it in a generic way, I would use the following:

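import sys
from ruamel.yaml.constructor import ConstructorError

def fun_call_by_name(val, mark=None):
    # mark is the optional position in the YAML source, used only for error reporting
    if '.' in val:
        module_name, fun_name = val.rsplit('.', 1)
        # you should restrict which modules may be loaded here
        # (an explicit raise, because assert is stripped under python -O)
        if not module_name.startswith('my.'):
            raise ValueError("module %r is not allowed" % module_name)
    else:
        module_name = '__main__'
        fun_name = val
    try:
        __import__(module_name)
    except ImportError as exc:
        raise ConstructorError(
            "while constructing a Python object", mark,
            "cannot find module %r (%s)" % (module_name, exc), mark)
    module = sys.modules[module_name]
    fun = getattr(module, fun_name)
    # note: this calls the function and returns its result
    return fun()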

This is adapted from find_python_name() in ruamel.yaml's constructor.py, which is used there to create objects from string scalars. If the val that is handed in contains a dot, the function assumes you are looking up a function name in another module; otherwise it looks the name up in __main__.
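
If you want the loaded dictionary to hold the functions themselves (as the question asks) rather than their return values, the same lookup can return the callable instead of calling it. A minimal sketch using only the standard library; the helper name resolve_callable and its allowed_prefixes parameter are hypothetical and not part of ruamel.yaml:

```python
import importlib
import sys


def resolve_callable(val, allowed_prefixes=('my.',)):
    """Return the callable named by val without calling it.

    Hypothetical helper for illustration; not part of ruamel.yaml.
    """
    if '.' in val:
        module_name, fun_name = val.rsplit('.', 1)
        # only import modules that fall under an explicitly allowed prefix
        if not (module_name + '.').startswith(tuple(allowed_prefixes)):
            raise ValueError("module %r is not allowed" % module_name)
        module = importlib.import_module(module_name)
    else:
        # bare names are looked up in the running script
        module = sys.modules['__main__']
        fun_name = val
    fun = getattr(module, fun_name, None)
    if not callable(fun):
        raise ValueError("%r does not name a callable" % val)
    return fun
```

A constructor registered for a tag could then return resolve_callable(val), so the loaded value is the function object and the caller decides when to invoke it.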

But I wouldn't magically interpret values from your top-level dictionary. YAML has a tagging mechanism (for specific tags, the find_python_name() method comes into play to control the type of the instances that are created).
If you have any control over what the YAML file looks like, use tags to mark the values that should not be loaded as plain strings, as in this file input.yaml:

COLUMN_NAME: !fun column_function    # tagged
PI_VAL: !fun my.test.pi              # also tagged
ANSWER: forty-two                    # this one has no tag

Assuming a subdirectory my with a file test.py with contents:

import math
def pi():
    return math.pi

You can use:

import sys
import ruamel.yaml

def column_function():
    return 3

def fun_constructor(loader, node):
    val = loader.construct_scalar(node)
    return fun_call_by_name(val)

# add the constructor for the tag !fun
ruamel.yaml.add_constructor('!fun', fun_constructor, Loader=ruamel.yaml.RoundTripLoader)

with open('input.yaml') as fp:
    data = ruamel.yaml.round_trip_load(fp)
assert data['COLUMN_NAME'] == 3
ruamel.yaml.round_trip_dump(data, sys.stdout)

to get:

COLUMN_NAME: 3                       # tagged
PI_VAL: 3.141592653589793            # also tagged
ANSWER: forty-two                    # this one has no tag

If you don't care about dumping data as YAML with comments preserved, you can use SafeLoader and safe_load() instead of RoundTripLoader and round_trip_load(), respectively.

Upvotes: 2
