WGH

Reputation: 3509

Make @lru_cache ignore some of the function arguments

How can I make the @functools.lru_cache decorator ignore some of a function's arguments when it builds the cache key?

For example, I have a function that looks like this:

def find_object(db_handle, query):
    # (omitted code)
    return result

If I apply the lru_cache decorator as is, db_handle is included in the cache key. As a result, calling the function with the same query but a different db_handle executes it again, which I'd like to avoid. I want lru_cache to consider query argument only.
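
To make the problem concrete, here is a minimal sketch of the behavior described above. The function body and the `calls` list are stand-ins for illustration, not the real lookup code:

```python
from functools import lru_cache

calls = []  # records every real execution (illustration only)

@lru_cache(maxsize=None)
def find_object(db_handle, query):
    calls.append((db_handle, query))
    return len(query)  # stand-in for the omitted lookup

find_object('h1', 'first_query')
find_object('h2', 'first_query')  # same query, different handle

# Both calls miss the cache because db_handle is part of the key.
info = find_object.cache_info()
```

Here `info.misses` is 2 and `info.hits` is 0, even though the query was the same both times.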

Upvotes: 93

Views: 24542

Answers (3)

Raymond Hettinger

Reputation: 226306

I want lru_cache to consider query argument only.

Apply the lru_cache to an auxiliary function that takes only the query argument. That way the cache only sees the one relevant argument, the one that uniquely identifies a single cached result.

Next, write a new function that accepts both arguments. It saves the first argument where the auxiliary function can reach it, then calls the auxiliary function with the second:

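from functools import lru_cache

def original_function(db_handle, query):
    "Unmodified original function."
    result = len(query)                 # Simulate a computation
    print('Running', query, 'with', db_handle, 'giving', result)
    return result

# Shared slot that hands db_handle to the cached helper without
# making it part of the cache key.
context = {'db_handle': None}

@lru_cache  # bare form needs Python 3.8+; use @lru_cache() on older versions
def auxiliary(query):
    db_handle = context['db_handle']
    return original_function(db_handle, query)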

def new_function(db_handle, query):
    context['db_handle'] = db_handle
    return auxiliary(query)

a = new_function('h1', 'first_query')   # Executes new query
b = new_function('h2', 'first_query')   # Cached query
c = new_function('h2', 'second_query')  # Executes new query
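
One way to check that only the query drives the cache is to look at `cache_info()` on the auxiliary function. This is a self-contained sketch of the same pattern; the `executed` list is an addition for illustration. Note that the shared `context` dict is module-level mutable state, so this sketch assumes single-threaded use:

```python
from functools import lru_cache

executed = []  # records real executions (illustration only)
context = {'db_handle': None}

def original_function(db_handle, query):
    executed.append((db_handle, query))
    return len(query)

@lru_cache(maxsize=None)
def auxiliary(query):
    # db_handle comes from the shared slot, so it is not in the cache key
    return original_function(context['db_handle'], query)

def new_function(db_handle, query):
    context['db_handle'] = db_handle
    return auxiliary(query)

new_function('h1', 'first_query')   # miss: runs with 'h1'
new_function('h2', 'first_query')   # hit: 'h2' is never used
new_function('h2', 'second_query')  # miss: runs with 'h2'

info = auxiliary.cache_info()
```

After these calls `info.hits` is 1 and `info.misses` is 2, and `executed` holds only the two calls that actually ran.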

Upvotes: 2

Yann

Reputation: 4181

With cachetools you can write:

from cachetools import cached
from cachetools.keys import hashkey

from random import randint

@cached(cache={}, key=lambda db_handle, query: hashkey(query))
def find_object(db_handle, query):
    print("processing {0}".format(query))
    return query

queries = list(range(5))
queries.extend(range(5))
for q in queries:
    print("result: {0}".format(find_object(randint(0, 1000), q)))

You will need to install cachetools (pip install cachetools).

The syntax is:

@cached(
    cache={},
    key=lambda <all-function-args>: hashkey(<relevant-args>)
)

Here is another example that includes keyword args:

@cached(
    cache={},
    key=lambda a, b, c=1, d=2: hashkey(a, c)
)
def my_func(a, b, c=1, d=2):
    return a + c

Note that in the example above the lambda's parameters match those of my_func. The signatures don't have to match exactly. For example, you can use **kwargs to swallow arguments that aren't needed in the hashkey:

@cached(
    cache={},
    key=lambda a, b, c=1, **kwargs: hashkey(a, c)
)
def my_func(a, b, c=1, d=2, e=3, f=4):
    return a + c

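In the above example we don't care about the d=, e= and f= args when looking up a cached value, so we can swallow them all with **kwargs. This relies on d, e and f being passed by keyword; a fourth positional argument would not fit the lambda's signature.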
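
If it helps to see what the key function is doing, here is a standard-library-only sketch of the same idea. `cached_by_key` is a hypothetical helper written for this illustration, not part of cachetools or functools, and unlike lru_cache it never evicts entries:

```python
import functools

def cached_by_key(key):
    """Hypothetical helper: cache results under key(*args, **kwargs)."""
    def decorator(func):
        cache = {}  # unbounded dict, no eviction

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            k = key(*args, **kwargs)
            if k not in cache:
                cache[k] = func(*args, **kwargs)
            return cache[k]

        return wrapper
    return decorator

processed = []  # records real executions (illustration only)

# The key function sees every argument but returns only the relevant one.
@cached_by_key(key=lambda db_handle, query: query)
def find_object(db_handle, query):
    processed.append(query)
    return query * 2

r1 = find_object('h1', 21)
r2 = find_object('h2', 21)  # served from the cache; 'h2' is ignored
```

Both calls return 42, and `processed` contains a single entry because the second call never reaches the function body.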

Upvotes: 81

WGH

Reputation: 3509

I have at least one very ugly solution. Wrap db_handle in an object that always compares equal, and unwrap it inside the function.

It requires a decorator with quite a few helper functions, which makes stack traces quite confusing.

import functools

class _Equals(object):
    def __init__(self, o):
        self.obj = o

    def __eq__(self, other):
        return True

    def __hash__(self):
        return 0

def lru_cache_ignoring_first_argument(*args, **kwargs):
    lru_decorator = functools.lru_cache(*args, **kwargs)

    def decorator(f):
        @lru_decorator
        def helper(arg1, *args, **kwargs):
            arg1 = arg1.obj
            return f(arg1, *args, **kwargs)

        @functools.wraps(f)
        def function(arg1, *args, **kwargs):
            arg1 = _Equals(arg1)
            return helper(arg1, *args, **kwargs)

        return function

    return decorator
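
For completeness, a usage sketch. The decorator is restated in compact form so the snippet runs on its own; the `calls` list and the function body are stand-ins for illustration. Note that the decorator must be called with parentheses, and the ignored argument must be passed positionally:

```python
import functools

class _Equals:
    """Wrapper that hashes and compares equal to anything."""
    def __init__(self, o):
        self.obj = o

    def __eq__(self, other):
        return True

    def __hash__(self):
        return 0

def lru_cache_ignoring_first_argument(*args, **kwargs):
    lru_decorator = functools.lru_cache(*args, **kwargs)

    def decorator(f):
        @lru_decorator
        def helper(arg1, *args, **kwargs):
            return f(arg1.obj, *args, **kwargs)  # unwrap before calling f

        @functools.wraps(f)
        def function(arg1, *args, **kwargs):
            return helper(_Equals(arg1), *args, **kwargs)  # wrap for the cache

        return function

    return decorator

calls = []  # records real executions (illustration only)

@lru_cache_ignoring_first_argument(maxsize=128)
def find_object(db_handle, query):
    calls.append((db_handle, query))
    return len(query)

first = find_object('h1', 'first_query')
second = find_object('h2', 'first_query')  # cache hit despite the new handle
```

Only the first call executes the body, so `calls` ends up with one entry and both results are equal.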

Upvotes: 15
