Reputation: 186562
Using Yuri Nudelman's code with the custom _import
definition to specify modules to restrict serves as a good base but when calling functions within said user_code
naturally due to having to whitelist everything is there any way to permit other user defined functions to be called? Open to other sandboxing solutions although Jupyter didn't seem straight-forward to embed within a web interface.
from RestrictedPython import safe_builtins, compile_restricted
from RestrictedPython.Eval import default_guarded_getitem
def _import(name, globals=None, locals=None, fromlist=(), level=0):
safe_modules = ["math"]
if name in safe_modules:
globals[name] = __import__(name, globals, locals, fromlist, level)
else:
raise Exception("Don't you even think about it {0}".format(name))
safe_builtins['__import__'] = _import # Must be a part of builtins
def execute_user_code(user_code, user_func, *args, **kwargs):
""" Executed user code in restricted env
Args:
user_code(str) - String containing the unsafe code
user_func(str) - Function inside user_code to execute and return value
*args, **kwargs - arguments passed to the user function
Return:
Return value of the user_func
"""
def _apply(f, *a, **kw):
return f(*a, **kw)
try:
# This is the variables we allow user code to see. @result will contain return value.
restricted_locals = {
"result": None,
"args": args,
"kwargs": kwargs,
}
# If you want the user to be able to use some of your functions inside his code,
# you should add this function to this dictionary.
# By default many standard actions are disabled. Here I add _apply_ to be able to access
# args and kwargs and _getitem_ to be able to use arrays. Just think before you add
# something else. I am not saying you shouldn't do it. You should understand what you
# are doing thats all.
restricted_globals = {
"__builtins__": safe_builtins,
"_getitem_": default_guarded_getitem,
"_apply_": _apply,
}
# Add another line to user code that executes @user_func
user_code += "\nresult = {0}(*args, **kwargs)".format(user_func)
# Compile the user code
byte_code = compile_restricted(user_code, filename="<user_code>", mode="exec")
# Run it
exec(byte_code, restricted_globals, restricted_locals)
# User code has modified result inside restricted_locals. Return it.
return restricted_locals["result"]
except SyntaxError as e:
# Do whaever you want if the user has code that does not compile
raise
except Exception as e:
# The code did something that is not allowed. Add some nasty punishment to the user here.
raise
i_example = """
import math
def foo():
return 7
def myceil(x):
return math.ceil(x)+foo()
"""
print(execute_user_code(i_example, "myceil", 1.5))
Running this returns 'foo' is not defined
Upvotes: 6
Views: 1874
Reputation: 27214
First of all, the replacement for the __import__
built-in is implemented incorrectly. That built-in is supposed to return the imported module, not mutate the globals to include it:
Python 3.9.12 (main, Mar 24 2022, 13:02:21)
[GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> __import__('math')
<module 'math' (built-in)>
>>> math
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'math' is not defined
A better way to reimplement __import__
would be this:
_SAFE_MODULES = frozenset(("math",))
def _safe_import(name, *args, **kwargs):
if name not in _SAFE_MODULES:
raise Exception(f"Don't you even think about {name!r}")
return __import__(name, *args, **kwargs)
The fact that you mutated globals in your original implementation was partially masking the primary bug. Namely: name assignments within restricted code (function definitions, variable assignments and imports) mutate the locals dict, but name look-ups are by default done as global look-ups, bypassing the locals entirely. You can see this by disassembling the restricted bytecode using __import__('dis').dis(byte_code)
:
2 0 LOAD_CONST 0 (0)
2 LOAD_CONST 1 (None)
4 IMPORT_NAME 0 (math)
6 STORE_NAME 0 (math)
4 8 LOAD_CONST 2 (<code object foo at 0x7fbef4eef3a0, file "<user_code>", line 4>)
10 LOAD_CONST 3 ('foo')
12 MAKE_FUNCTION 0
14 STORE_NAME 1 (foo)
7 16 LOAD_CONST 4 (<code object myceil at 0x7fbef4eef660, file "<user_code>", line 7>)
18 LOAD_CONST 5 ('myceil')
20 MAKE_FUNCTION 0
22 STORE_NAME 2 (myceil)
24 LOAD_CONST 1 (None)
26 RETURN_VALUE
Disassembly of <code object foo at 0x7fbef4eef3a0, file "<user_code>", line 4>:
5 0 LOAD_CONST 1 (7)
2 RETURN_VALUE
Disassembly of <code object myceil at 0x7fbef4eef660, file "<user_code>", line 7>:
8 0 LOAD_GLOBAL 0 (_getattr_)
2 LOAD_GLOBAL 1 (math)
4 LOAD_CONST 1 ('ceil')
6 CALL_FUNCTION 2
8 LOAD_FAST 0 (x)
10 CALL_FUNCTION 1
12 LOAD_GLOBAL 2 (foo)
14 CALL_FUNCTION 0
16 BINARY_ADD
18 RETURN_VALUE
The documentation for exec
explains (emphasis mine):
If only globals is provided, it must be a dictionary (and not a subclass of dictionary), which will be used for both the global and the local variables. If globals and locals are given, they are used for the global and local variables, respectively. If provided, locals can be any mapping object. Remember that at the module level, globals and locals are the same dictionary. If exec gets two separate objects as globals and locals, the code will be executed as if it were embedded in a class definition.
This makes separate mappings for locals and globals completely spurious. We can therefore simply get rid of the locals dict, and put everything in globals. The entire code should look something like this:
from RestrictedPython import safe_builtins, compile_restricted
_SAFE_MODULES = frozenset(("math",))
def _safe_import(name, *args, **kwargs):
if name not in _SAFE_MODULES:
raise Exception(f"Don't you even think about {name!r}")
return __import__(name, *args, **kwargs)
def execute_user_code(user_code, user_func, *args, **kwargs):
my_globals = {
"__builtins__": {
**safe_builtins,
"__import__": _safe_import,
},
}
try:
byte_code = compile_restricted(
user_code, filename="<user_code>", mode="exec")
except SyntaxError:
# syntax error in the sandboxed code
raise
try:
exec(byte_code, my_globals)
return my_globals[user_func](*args, **kwargs)
except BaseException:
# runtime error (probably) in the sandboxed code
raise
Above I also managed to fix a couple of tangential issues:
user_func
happens to come from an untrusted source, and avoids having to inject args
, kwargs
and result
into the sandbox, which would enable sandboxed code to clobber it.safe_builtins
object provided by the RestrictedPython
module. Otherwise, if any other code within your program happens to be using RestrictedPython
, it might have been affected.BaseException
, to also catch cases when sandboxed code attempts to raise KeyboardInterrupt
or SystemExit
(which do not derive from Exception
, but only BaseException
)._getitem_
and _apply_
, which don’t seem to be used for anything. If they turn out to be necessary after all, you may restore them.(Note, however, that this still does not protect against DoS via infinite loops within the sandbox.)
Upvotes: 7
Reputation: 43083
In exec(byte_code, restricted_globals, restricted_locals)
:
def foo():
modifies restricted_locals
.myceil
can only see its globals, i.e. myceil.__globals__
, which is restricted_globals
.You can update f.__globals__
/restricted_globals
in _apply
:
def _apply(f, *a, **kw):
for k, v in restricted_locals.items():
if k not in _restricted_locals_keys and k not in f.__globals__:
f.__globals__[k] = v
return f(*a, **kw)
where _restricted_locals_keys
is:
restricted_locals = {
"result": None,
"args": args,
"kwargs": kwargs,
}
_restricted_locals_keys = set(restricted_locals.keys())
If you don't want to modify restricted_globals
, then copy f
with new globals in _apply
:
import copy
import types
def _copy_func(f, func_globals=None):
func_globals = func_globals or f.__globals__
return types.FunctionType(f.__code__, func_globals, name=f.__name__, argdefs=f.__defaults__, closure=f.__closure__)
def _apply(f, *a, **kw):
func_globals = copy.copy(f.__globals__)
for k, v in restricted_locals.items():
if k not in _restricted_locals_keys and k not in func_globals:
func_globals[k] = v
f = _copy_func(f, func_globals=func_globals)
return f(*a, **kw)
Upvotes: 2