Reputation: 15359
In Python 2, I would like to evaluate a string that contains the representation of a literal. I would like to do this safely, so I don't want to use eval(); instead I've become accustomed to using ast.literal_eval() for this kind of task.
However, I also want to evaluate under the assumption that string literals in plain quotes denote unicode objects, i.e. the kind of forward-compatible behavior you get with from __future__ import unicode_literals. In the example below, eval() seems to respect this preference, but ast.literal_eval() seems not to.
from __future__ import unicode_literals, print_function
import ast
raw = r""" 'hello' """
value = eval(raw.strip())
print(repr(value))
# Prints:
# u'hello'
value = ast.literal_eval(raw.strip())
print(repr(value))
# Prints:
# 'hello'
Note that I'm looking for a general-purpose literal_eval replacement: I don't know in advance that the output is necessarily a string object. I want to be able to assume that raw is the representation of an arbitrary Python literal, which may be a string, may contain one or more strings, or may contain none.
Is there a way of getting the best of both worlds: a function that both securely evaluates representations of arbitrary Python literals and respects the unicode_literals preference?
Upvotes: 3
Views: 129
Reputation: 106946
What makes code potentially insecure is references to names and/or attributes. You can subclass ast.NodeVisitor to ensure that there are no such references in a given piece of code before you eval it:
import ast
from textwrap import dedent

class Validate(ast.NodeVisitor):
    # Any reference to a name or an attribute aborts validation.
    def visit_Name(self, node):
        raise ValueError("Reference to name '%s' found in expression" % node.id)

    def visit_Attribute(self, node):
        raise ValueError("Reference to attribute '%s' found in expression" % node.attr)

# raw is the string to evaluate, defined as in the question.
Validate().visit(ast.parse(dedent(raw), '<inline>', 'eval'))
eval(raw)
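As a minimal sketch of how the validator could be packaged, the snippet below wraps the two steps in one function. The name checked_eval is hypothetical, and stripping the input before parsing is an assumption borrowed from the question's use of raw.strip(). On Python 2, eval() only produces unicode objects here if the module that calls it has the unicode_literals future import in effect.

```python
import ast


class Validate(ast.NodeVisitor):
    """Raise ValueError as soon as a name or attribute reference is seen."""

    def visit_Name(self, node):
        raise ValueError("Reference to name '%s' found in expression" % node.id)

    def visit_Attribute(self, node):
        raise ValueError("Reference to attribute '%s' found in expression" % node.attr)


def checked_eval(source):
    """Hypothetical helper: validate the expression, then evaluate it."""
    source = source.strip()
    # Parsing in 'eval' mode restricts the input to a single expression.
    Validate().visit(ast.parse(source, '<inline>', 'eval'))
    return eval(source)


print(repr(checked_eval(" [1, 'two', (3.0, 'four')] ")))

try:
    checked_eval("__import__('os').getcwd()")
except ValueError as exc:
    print("rejected: %s" % exc)
```

One caveat worth checking before relying on this: on Python 2, True, False and None parse as names, so this validator would reject them as well, which ast.literal_eval does not.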
Upvotes: 1
Reputation: 281594
Neither ast.literal_eval nor ast.parse offers the option to set compiler flags. You can pass the appropriate flags to compile to parse the string with unicode_literals activated, then run ast.literal_eval on the resulting node:
import ast
# Not a future statement. This imports the __future__ module, and has no special
# effects beyond that.
import __future__
unparsed = '"blah"'
parsed = compile(unparsed,
                 '<string>',
                 'eval',
                 ast.PyCF_ONLY_AST | __future__.unicode_literals.compiler_flag)
value = ast.literal_eval(parsed)
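The same idea can be sketched as a reusable function that also covers literals containing nested strings, which is what the question asks for. The name literal_eval_unicode is hypothetical, and the strip() call is an assumption mirroring raw.strip() in the question, since leading whitespace may be rejected when compiling in 'eval' mode.

```python
import ast
import __future__  # plain import of the module, not a future statement


def literal_eval_unicode(source):
    """Hypothetical helper: literal_eval with unicode_literals in effect."""
    flags = ast.PyCF_ONLY_AST | __future__.unicode_literals.compiler_flag
    # With PyCF_ONLY_AST, compile() returns an AST instead of a code object.
    tree = compile(source.strip(), '<string>', 'eval', flags)
    # literal_eval accepts an already parsed expression node.
    return ast.literal_eval(tree)


value = literal_eval_unicode(""" {'name': 'hello', 'items': ['a', 1, 2.5]} """)
print(repr(value))
```

Because the flag is passed explicitly, the result should not depend on whether the calling module itself uses from __future__ import unicode_literals. Non-literal input is still rejected by ast.literal_eval in the usual way.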
Upvotes: 4
Reputation: 363183
An interesting question. I'm not sure whether there is a solution using ast.literal_eval itself, but here is a cheap and safe workaround: let ast.literal_eval vet the string first, then eval it:
import ast
def my_literal_eval(s):
    # Raises if s is not a plain literal, so eval() below only sees vetted input.
    ast.literal_eval(s)
    return eval(s)
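A short usage sketch, with made-up sample inputs, shows the vetting step at work: the literal is evaluated, while the expression containing a call is rejected before eval() ever runs. Note that eval() picks up unicode_literals only when that future import is in effect in the module where my_literal_eval is defined.

```python
import ast


def my_literal_eval(s):
    # literal_eval raises ValueError or SyntaxError for anything but a literal.
    ast.literal_eval(s)
    return eval(s)


# Illustrative inputs: one plain literal, one expression with a function call.
for candidate in ("{'a': [1, 2]}", "__import__('os').getcwd()"):
    try:
        print(repr(my_literal_eval(candidate)))
    except (ValueError, SyntaxError):
        print("rejected: %r" % candidate)
```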
Upvotes: 4