Reputation: 191
I use a decorator to extend memoization via lru_cache to methods of objects which aren't themselves hashable (following stackoverflow.com/questions/33672412/python-functools-lru-cache-with-class-methods-release-object). This memoization works fine with python 3.6 but shows unexpected behavior on python 3.7.
Observed behavior: If the memoized method is called with keyword arguments, memoization works fine on both Python versions. If it's called with positional arguments only, it works on 3.6 but not on 3.7.
==> What could cause the different behavior?
The code sample below shows a minimal example which reproduces the behavior.
test_memoization_kwarg_call passes for both Python 3.6 and 3.7.
test_memoization_arg_call passes for Python 3.6 but fails for 3.7.
import random
import weakref
from functools import lru_cache

def memoize_method(func):
    # From stackoverflow.com/questions/33672412/python-functools-lru-cache-with-class-methods-release-object
    def wrapped_func(self, *args, **kwargs):
        self_weak = weakref.ref(self)
        @lru_cache()
        def cached_method(*args_, **kwargs_):
            return func(self_weak(), *args_, **kwargs_)
        setattr(self, func.__name__, cached_method)
        print(args)
        print(kwargs)
        return cached_method(*args, **kwargs)
    return wrapped_func

class MyClass:
    @memoize_method
    def randint(self, param):
        return random.randint(0, int(1E9))

def test_memoization_kwarg_call():
    obj = MyClass()
    assert obj.randint(param=1) == obj.randint(param=1)
    assert obj.randint(1) == obj.randint(1)

def test_memoization_arg_call():
    obj = MyClass()
    assert obj.randint(1) == obj.randint(1)
Note that, weirdly, the line assert obj.randint(1) == obj.randint(1) does not cause a failure inside test_memoization_kwarg_call (on either version), yet the same line fails on Python 3.7 inside test_memoization_arg_call.
Python versions: 3.6.8 and 3.7.3, respectively.
user2357112 suggested inspecting the bytecode with import dis; dis.dis(test_memoization_arg_call).
On Python 3.6 this gives
 36           0 LOAD_GLOBAL              0 (MyClass)
              2 CALL_FUNCTION            0
              4 STORE_FAST               0 (obj)

 37           6 LOAD_FAST                0 (obj)
              8 LOAD_ATTR                1 (randint)
             10 LOAD_CONST               1 (1)
             12 CALL_FUNCTION            1
             14 LOAD_FAST                0 (obj)
             16 LOAD_ATTR                1 (randint)
             18 LOAD_CONST               1 (1)
             20 CALL_FUNCTION            1
             22 COMPARE_OP               2 (==)
             24 POP_JUMP_IF_TRUE        30
             26 LOAD_GLOBAL              2 (AssertionError)
             28 RAISE_VARARGS            1
        >>   30 LOAD_CONST               0 (None)
             32 RETURN_VALUE
On Python 3.7 this gives
 36           0 LOAD_GLOBAL              0 (MyClass)
              2 CALL_FUNCTION            0
              4 STORE_FAST               0 (obj)

 37           6 LOAD_FAST                0 (obj)
              8 LOAD_METHOD              1 (randint)
             10 LOAD_CONST               1 (1)
             12 CALL_METHOD              1
             14 LOAD_FAST                0 (obj)
             16 LOAD_METHOD              1 (randint)
             18 LOAD_CONST               1 (1)
             20 CALL_METHOD              1
             22 COMPARE_OP               2 (==)
             24 POP_JUMP_IF_TRUE        30
             26 LOAD_GLOBAL              2 (AssertionError)
             28 RAISE_VARARGS            1
        >>   30 LOAD_CONST               0 (None)
             32 RETURN_VALUE
The difference is that on 3.6 the call to the cached randint method compiles to LOAD_ATTR, LOAD_CONST, CALL_FUNCTION, while on 3.7 it compiles to LOAD_METHOD, LOAD_CONST, CALL_METHOD. This may explain the difference in behavior, but I do not understand the internals of CPython well enough to say. Any ideas?
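For anyone who wants to compare call sequences across interpreters without reading full dis listings, here is a minimal sketch that collects only the opcode names of a call site. The names Demo, call_site and opnames are made up for this illustration and are not part of the code above. The exact opcode names depend on the interpreter version, so no particular output is claimed.

```python
import dis


class Demo:
    # Hypothetical stand-in for a class with an ordinary method.
    def method(self, param):
        return param


def call_site():
    # A plain positional method call, like obj.randint(1) in the question.
    obj = Demo()
    return obj.method(1)


def opnames(func):
    """Return the opcode names of func's bytecode, in order."""
    return [ins.opname for ins in dis.get_instructions(func)]


names = opnames(call_site)
print(names)
```

Running this under two interpreters and diffing the printed lists shows whether the compiler emits different instructions for the same source line.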
Upvotes: 14
Views: 1604
Reputation: 281012
This is a bug specifically in the Python 3.7.3 micro release. It was not present in Python 3.7.2, and it should not be present in Python 3.7.4 or 3.8.0. It was filed as Python issue 36650.
At C level, calls with no keyword arguments and calls with an empty **kwargs dict are handled differently. Depending on details of how a function is implemented, the function may receive NULL for kwargs instead of an empty kwargs dict. The C accelerator for functools.lru_cache treated calls with NULL kwargs differently from calls with an empty kwargs dict, leading to the bug you see here.
With the method cache recipe you're using, the first call to a method will always pass an empty kwargs dict to the C-level LRU wrapper, whether or not any keyword arguments were used, because of the return cached_method(*args, **kwargs) in wrapped_func. Subsequent calls may pass a NULL kwargs dict, because they no longer go through wrapped_func. This is why you could not reproduce the bug with test_memoization_kwarg_call; the first call has to pass no keyword arguments.
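The two call paths described above can be exercised without the method recipe at all. The following is a rough sketch, not a verified reproduction: square is a throwaway function invented for this example, and I am assuming that unpacking an empty dict at the call site is enough to reach the C wrapper with an empty kwargs dict on the affected release. If the explanation above holds, a 3.7.3 interpreter should report two misses, while releases without the bug should report one miss and one hit.

```python
import sys
from functools import lru_cache


@lru_cache()
def square(x):
    # Throwaway function; only the cache bookkeeping matters here.
    return x * x


empty = {}
square(1, **empty)  # call that unpacks an explicitly empty kwargs dict
square(1)           # plain positional call with no kwargs at all

info = square.cache_info()
print(sys.version_info[:3], info)
```

Comparing info.hits and info.misses across interpreter versions shows whether the two calls were treated as the same cache key.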
Upvotes: 4
Reputation: 9946
i've never said this about python before, but this honestly looks like a bug. i have no idea why it's happening, because all this stuff is in underlying C.
but here's what i'm seeing, attempting to peer into the black box:
i added some simple printing to your code:
def memoize_method(func):
    # From stackoverflow.com/questions/33672412/python-functools-lru-cache-with-class-methods-release-object
    def wrapped_func(self, *args, **kwargs):
        self_weak = weakref.ref(self)
        print('wrapping func')
        @lru_cache()
        def cached_method(*args_, **kwargs_):
            print('in cached_method', args_, kwargs_, id(cached_method))
            return func(self_weak(), *args_, **kwargs_)
        setattr(self, func.__name__, cached_method)
        return cached_method(*args, **kwargs)
    return wrapped_func
then i tested the function like this:
def test_memoization_arg_call():
    obj = MyClass()
    for _ in range(5):
        print(id(obj.randint), obj.randint(1), obj.randint.cache_info(), id(obj.randint))
    print()
    for _ in range(5):
        print(id(obj.randint), obj.randint(2), obj.randint.cache_info(), id(obj.randint))
here's the output:
==================================
wrapping func
in cached_method (1,) {} 4525448992
4521585800 668415661 CacheInfo(hits=0, misses=1, maxsize=128, currsize=1) 4525448992
in cached_method (1,) {} 4525448992
4525448992 920166498 CacheInfo(hits=0, misses=2, maxsize=128, currsize=2) 4525448992
4525448992 920166498 CacheInfo(hits=1, misses=2, maxsize=128, currsize=2) 4525448992
4525448992 920166498 CacheInfo(hits=2, misses=2, maxsize=128, currsize=2) 4525448992
4525448992 920166498 CacheInfo(hits=3, misses=2, maxsize=128, currsize=2) 4525448992
in cached_method (2,) {} 4525448992
4525448992 690871031 CacheInfo(hits=3, misses=3, maxsize=128, currsize=3) 4525448992
4525448992 690871031 CacheInfo(hits=4, misses=3, maxsize=128, currsize=3) 4525448992
4525448992 690871031 CacheInfo(hits=5, misses=3, maxsize=128, currsize=3) 4525448992
4525448992 690871031 CacheInfo(hits=6, misses=3, maxsize=128, currsize=3) 4525448992
4525448992 690871031 CacheInfo(hits=7, misses=3, maxsize=128, currsize=3) 4525448992
the interesting thing here is that it seems like it mis-caches the first positional args call. this doesn't happen with kwargs, and if you call a kwargs call first, it won't mis-cache that or any following pos args calls (which, for whatever reason, means your kwargs test is working). the important lines are this:
==================================
wrapping func
in cached_method (1,) {} 4525448992
4521585800 668415661 CacheInfo(hits=0, misses=1, maxsize=128, currsize=1) 4525448992
in cached_method (1,) {} 4525448992
4525448992 920166498 CacheInfo(hits=0, misses=2, maxsize=128, currsize=2) 4525448992
4525448992 920166498 CacheInfo(hits=1, misses=2, maxsize=128, currsize=2) 4525448992
you can see that i'm in function cached_method with id 4525448992 twice with the exact same args/kwargs, but it's not caching. it even shows the misses themselves in CacheInfo (first, the cache is empty. second, it can't find (1,) for some reason). that's all in C, so i don't know how to fix it...
i guess the best answer is to use another lru_cache method and wait for the devs to fix whatever's happening here.
edit: btw, great question.
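One possible stopgap, sketched below under assumptions, is to keep the recipe but make every call take the same path into the cached function. Instead of storing cached_method directly on the instance, this variant stores a small pure-Python forwarder (call_cached, a name invented here) that always calls cached_method(*args_, **kwargs_), just like the first call does. I have not run this on 3.7.3, so treat it as an idea rather than a confirmed fix.

```python
import random
import weakref
from functools import lru_cache, wraps


def memoize_method(func):
    """Per-instance lru_cache whose calls always take the same path."""
    @wraps(func)
    def wrapped_func(self, *args, **kwargs):
        self_weak = weakref.ref(self)

        @lru_cache()
        def cached_method(*args_, **kwargs_):
            return func(self_weak(), *args_, **kwargs_)

        def call_cached(*args_, **kwargs_):
            # Always forward through *args/**kwargs, as the first call does.
            return cached_method(*args_, **kwargs_)

        # Expose the usual lru_cache helpers on the forwarder.
        call_cached.cache_info = cached_method.cache_info
        call_cached.cache_clear = cached_method.cache_clear
        setattr(self, func.__name__, call_cached)
        return call_cached(*args, **kwargs)
    return wrapped_func


class MyClass:
    @memoize_method
    def randint(self, param):
        return random.randint(0, int(1E9))


obj = MyClass()
first = obj.randint(1)
second = obj.randint(1)
print(first == second, obj.randint.cache_info())
```

The extra Python-level call costs a little on every lookup. As with plain lru_cache, obj.randint(1) and obj.randint(param=1) are still cached under separate keys.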
Upvotes: 1
Reputation: 1074
I have a simpler solution to the problem:
pip install methodtools
Then,
import random
from methodtools import lru_cache

class MyClass:
    @lru_cache()
    def randint(self, param):
        return random.randint(0, int(1E9))

def test_memoization_kwarg_call():
    obj = MyClass()
    assert obj.randint(param=1) == obj.randint(param=1)
    assert obj.randint(1) == obj.randint(1)
I am sorry that this does not answer the "why", but it may help if you are also interested in fixing the problem. This was tested with 3.7.3.
Upvotes: 2