Reputation: 1595
First of all I want to mention that I know this is a horrible idea and it shouldn't be done. My intention is mainly curiosity and learning the innards of Python, and how to 'hack' them.
I was wondering whether it is at all possible to change what happens when we, for instance, use [] to create a list. Is there a way to modify how the parser behaves in order to, for instance, cause ["hello world"] to call print("hello world") instead of creating a list with one element?
I've attempted to find any documentation or posts about this but failed to do so.
Below is an example of replacing the built-in dict to instead use a custom class:
from __future__ import annotations
from typing import List, Any
import builtins

class Dict(dict):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # expose the keys as attributes as well
        self.__dict__ = self

    def subset(self, keys: List[Any]) -> Dict:
        return Dict({key: self[key] for key in keys})

# make the name `dict` resolve to the subclass everywhere
builtins.dict = Dict
When this module is imported, it replaces the dict built-in with the Dict class. However, this only works when we directly call dict(). If we attempt to use {}, it will fall back to the base dict built-in implementation:
import new_dict
a = dict({'a': 5, 'b': 8})
b = {'a': 5, 'b': 8}
print(type(a))
print(type(b))
Yields:
<class 'py_extensions.new_dict.Dict'>
<class 'dict'>
Upvotes: 5
Views: 813
Reputation: 9240
The ast module is an interface to Python's abstract syntax tree (AST), which is built after parsing Python code.
It is possible to replace a dict literal ({}) with a call to dict() by modifying the AST of the Python code before it is compiled.
import ast
import new_dict
a = dict({"a": 5, "b": 8})
b = {"a": 5, "b": 8}
print(type(a))
print(type(b))
print(type({"a": 5, "b": 8}))
src = """
a = dict({"a": 5, "b": 8})
b = {"a": 5, "b": 8}
print(type(a))
print(type(b))
print(type({"a": 5, "b": 8}))
"""
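class RewriteDict(ast.NodeTransformer):
    def visit_Dict(self, node):
        # rewrite dict literals nested inside this one first
        self.generic_visit(node)
        # don't replace `dict({"a": 1})`; the isinstance check on `func`
        # avoids an AttributeError for calls such as `obj.method({...})`
        parent = getattr(node, "parent", None)
        if (
            isinstance(parent, ast.Call)
            and isinstance(parent.func, ast.Name)
            and parent.func.id == "dict"
        ):
            return node
        # replace `{"a": 1}` with `dict({"a": 1})`
        new_node = ast.Call(
            func=ast.Name(id="dict", ctx=ast.Load()),
            args=[node],
            keywords=[],
        )
        # reuse the literal's source position for the new call
        return ast.fix_missing_locations(ast.copy_location(new_node, node))

tree = ast.parse(src)
# set parent to every node
for node in ast.walk(tree):
    for child in ast.iter_child_nodes(node):
        child.parent = node
RewriteDict().visit(tree)
exec(compile(tree, "ast", "exec"))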
Output:
<class 'new_dict.Dict'>
<class 'dict'>
<class 'dict'>
<class 'new_dict.Dict'>
<class 'new_dict.Dict'>
<class 'new_dict.Dict'>
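The same approach can be tried on the list literal from the question. The following is only a minimal sketch that does not depend on new_dict: LoudList is a hypothetical class invented for this illustration, and the rewritten source is a small string rather than a real module.

```python
import ast


class LoudList(list):
    """Hypothetical list subclass, used only to make the rewrite visible."""


class RewriteList(ast.NodeTransformer):
    def visit_List(self, node):
        # Rewrite list literals nested inside this one first.
        self.generic_visit(node)
        # Leave assignment targets such as `[a, b] = pair` untouched.
        if not isinstance(node.ctx, ast.Load):
            return node
        # Replace `[...]` with `LoudList([...])`.
        call = ast.Call(
            func=ast.Name(id="LoudList", ctx=ast.Load()),
            args=[node],
            keywords=[],
        )
        return ast.copy_location(call, node)


src = "result = [1, [2, 3]]"
tree = RewriteList().visit(ast.parse(src))
ast.fix_missing_locations(tree)

# The rewritten code looks up `LoudList` by name, so it must be in scope.
namespace = {"LoudList": LoudList}
exec(compile(tree, "<rewritten>", "exec"), namespace)
result = namespace["result"]
print(type(result).__name__, type(result[1]).__name__)
```

Note that this only affects source code that is passed through the transformer before compilation; list literals in modules compiled the normal way still produce plain lists.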
Upvotes: 1
Reputation: 1643
[] and {} are compiled to dedicated opcodes that always build a list or a dict, respectively. On the other hand, list() and dict() compile to bytecode that looks up the names list and dict (first in the module globals, then in the builtins) and then calls whatever it finds as a function:
import dis
dis.dis(lambda:[])
dis.dis(lambda:{})
dis.dis(lambda:list())
dis.dis(lambda:dict())
prints (with some additional newlines for clarity):
  3           0 BUILD_LIST               0
              2 RETURN_VALUE

  5           0 BUILD_MAP                0
              2 RETURN_VALUE

  7           0 LOAD_GLOBAL              0 (list)
              2 CALL_FUNCTION            0
              4 RETURN_VALUE

  9           0 LOAD_GLOBAL              0 (dict)
              2 CALL_FUNCTION            0
              4 RETURN_VALUE
Thus you can overwrite what dict() returns simply by overwriting the global dict, but you can't overwrite what {} returns.
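This difference can be checked directly. The following is a small sketch, assuming CPython; MyDict is a hypothetical class made up for the demonstration, and the built-in is restored afterwards so the rest of the process is unaffected.

```python
import builtins
import dis


def opnames(func):
    """Return the names of the opcodes that `func` was compiled to."""
    return [ins.opname for ins in dis.get_instructions(func)]


literal_ops = opnames(lambda: {})    # contains BUILD_MAP
call_ops = opnames(lambda: dict())   # contains LOAD_GLOBAL


class MyDict(dict):
    """Hypothetical replacement class, for illustration only."""


original_dict = builtins.dict
builtins.dict = MyDict
try:
    from_call = dict()   # name lookup at runtime finds MyDict
    from_literal = {}    # BUILD_MAP ignores the name `dict` entirely
finally:
    # Always put the real built-in back.
    builtins.dict = original_dict

print(type(from_call).__name__, type(from_literal).__name__)
```

The exact instruction sequence around the call differs between CPython versions, which is why the sketch only looks at opcode names rather than at the full disassembly.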
These opcodes are documented in the documentation of the dis module. If the BUILD_MAP opcode runs, you get a dict, no way around it. As an example, the implementation of BUILD_MAP in CPython calls the function _PyDict_FromItems. It doesn't look at any kind of user-defined class; it specifically makes the C struct that represents a Python dict.
It is possible in at least some cases to manipulate Python bytecode at runtime. If you really wanted to make {} return a custom class, I suppose you could write some code to search for the BUILD_MAP opcode and replace it with the appropriate opcodes. However, those opcode sequences aren't the same size, so there are probably quite a few additional changes you'd have to make.
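As a rough sketch of only the first step, the search, the following walks a code object and its nested code objects and reports where a dict-building opcode occurs. The helper name find_dict_builds is hypothetical, and including BUILD_CONST_KEY_MAP is an assumption based on some CPython versions using that opcode for literals whose keys are all constants. The sketch does not attempt the replacement itself.

```python
import dis
import types

# Assumption: besides BUILD_MAP, some CPython versions emit
# BUILD_CONST_KEY_MAP for dict literals with several constant keys.
DICT_OPS = {"BUILD_MAP", "BUILD_CONST_KEY_MAP"}


def find_dict_builds(code, path=""):
    """Yield (location, offset) for each dict-building opcode in `code`."""
    name = f"{path}.{code.co_name}" if path else code.co_name
    for ins in dis.get_instructions(code):
        if ins.opname in DICT_OPS:
            yield name, ins.offset
    # Function bodies are stored as nested code objects in co_consts.
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from find_dict_builds(const, name)


src = """
def make():
    return {}

empty = {}
"""
code = compile(src, "<example>", "exec")
sites = list(find_dict_builds(code))
print(sites)
```

Each reported offset is a place where new instructions would have to be spliced in, which is where the size mismatch mentioned above starts to matter: jump targets and line-number tables after that point would also need adjusting.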
Upvotes: 3