Reputation: 12860
I'm trying to filter a string before passing it through eval in Python. I want to limit it to math functions, but I'm not sure how to strip it with a regex. Consider the following:
s = 'math.pi * 8'
I want that to basically translate to 'math.pi*8', stripped of spaces. I also want to strip any letters [A-Za-z] that are not preceded by math\.
So if s = 'while(1): print "hello"', I want any executable part of it to be stripped: s would ideally equal something like (1):"" in that scenario (all letters gone, because they were not preceded by math\.).
Here's the regex I've tried:
(?<!math\.)[A-Za-z\s]+
and the Python call:
re.sub(r'(?<!math\.)[A-Za-z\s]+', r'', 'math.pi * 8')
But the result is '.p*8', because the letters of math itself are not preceded by math., and the i in pi is not directly preceded by math. either.
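To make the failure concrete, here is a small sketch that lists every span the pattern removes. It only reuses the pattern and input from above; the variable names are mine.

```python
import re

pattern = r'(?<!math\.)[A-Za-z\s]+'
s = 'math.pi * 8'

# Show each match with its start offset: these are the spans re.sub deletes.
for m in re.finditer(pattern, s):
    print(m.start(), repr(m.group()))

# The lookbehind is only checked at the start of each match, so 'math'
# (not preceded by 'math.') and 'i ' (preceded by 'ath.p') are both removed.
result = re.sub(pattern, '', s)
print(repr(result))
```

Only the p survives, because it is the single letter that sits immediately after math.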
How can I strip letters that are neither part of math itself nor preceded by math.?
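I followed @Thomas's answer, but also stripped square brackets, spaces, and underscores from the string, in the hope that no Python function can be executed other than through the math module:
import math
import re

# Strip bracketed subscripts, whitespace, and underscores before evaluating.
s = re.sub(r'(\[.*?\]|\s+|_)', '', s)
# Evaluate with builtins disabled and only the math module exposed.
s = eval(s, {'__builtins__': None, 'math': math})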
Upvotes: 1
Views: 115
Reputation: 6752
As @Carl says in a comment, look at what lybniz does for something better. But even this is not enough!
The technique described at the link is the following:
print eval(raw_input(), {"__builtins__":None}, {'pi':math.pi})
But this doesn't prevent something like
([x for x in 1.0.__class__.__base__.__subclasses__()
if x.__name__ == 'catch_warnings'][0]()
)._module.__builtins__['__import__']('os').system('echo hi!')
Source: Several of Ned Batchelder's posts on sandboxing, see http://nedbatchelder.com/blog/201302/looking_for_python_3_builtins.html
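A harmless way to see the hole is to check that attribute access still works when builtins are disabled. This is only a sketch, written for Python 3 (the snippets above are Python 2), and the variable names are mine. It does nothing beyond listing classes.

```python
# Globals with builtins disabled, as in the eval call quoted above.
restricted = {"__builtins__": None}

# No builtin names are needed: attribute access on a literal is enough
# to walk from float up to object and down to every loaded class.
base = eval("(1.0).__class__.__base__", restricted)
subclasses = eval("(1.0).__class__.__base__.__subclasses__()", restricted)

print(base is object)
print(len(subclasses) > 0)
```

Once arbitrary classes are reachable, removing names from the namespace no longer limits what the expression can touch.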
Edit: it was pointed out that square brackets and spaces are stripped from the input, so:
1.0.__class__.__base__.__subclasses__().__getitem__(i)()._module.__builtins__.get('__import__')('os').system('echo hi')
where you just try a lot of values for i.
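If the goal is only arithmetic with a few math names, one alternative to filtering the string is to skip eval entirely and walk the parsed syntax tree against a whitelist. The following is a rough sketch of that different technique, not what lybniz does. It assumes Python 3.8+ (for ast.Constant), and safe_math_eval and the whitelists are hypothetical names chosen for illustration. Power is deliberately left out, since huge exponents can hang the interpreter.

```python
import ast
import math
import operator

# Hypothetical whitelists: extend them deliberately, never by default.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}
_MATH_NAMES = {"pi", "e", "sqrt", "sin", "cos", "tan", "log", "exp"}


def safe_math_eval(expr):
    """Evaluate arithmetic that may use whitelisted math.<name> references."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        # Plain int/float literals only (bool and str are rejected).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](walk(node.operand))
        # Only math.<whitelisted name>; chained attributes are rejected.
        if (isinstance(node, ast.Attribute)
                and isinstance(node.value, ast.Name)
                and node.value.id == "math"
                and node.attr in _MATH_NAMES):
            return getattr(math, node.attr)
        # Calls are allowed only on a whitelisted math attribute.
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and not node.keywords):
            func = walk(node.func)
            return func(*[walk(arg) for arg in node.args])
        raise ValueError("disallowed expression: %s" % type(node).__name__)

    return walk(ast.parse(expr, mode="eval"))


print(safe_math_eval("math.pi * 8"))
```

Anything outside the whitelist, including the __class__ chain above, raises ValueError, and statements such as while fail earlier with SyntaxError.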
Upvotes: 2