Reputation: 29126
I have some objects that are very slow to instantiate. They are representations of data loaded from external sources such as YAML files, and loading large YAML files is slow (I don't know why).
I know these objects depend on some external factors:
Ideally I would like a transparent, boilerplate-free way to cache these objects as long as the external factors are the same:
import os

# `cache` is the hypothetical decorator I am looking for
@cache(depfiles=('foo',), depvars=(os.environ['FOO'],))
class Foo():
    def __init__(self, *args, **kwargs):
        with open('foo') as fd:
            self.foo = fd.read()
        self.FOO = os.environ['FOO']
        self.args = args
        self.kwargs = kwargs
The main idea is that the first time I instantiate Foo, a cache file is created with the content of the object. The next time I instantiate it (in another Python session), the cache file is used only if none of the dependencies or arguments have changed.
The solution I've found so far is based on shelve:
import shelve
import time

class Foo(object):
    _cached = False  # set to True on instances restored from the cache

    def __new__(cls, *args, **kwargs):
        cache = shelve.open('cache')
        try:
            cache_foo = cache.get(cls.__name__)
        finally:
            cache.close()
        if isinstance(cache_foo, Foo):
            cache_foo._cached = True
            return cache_foo
        # object.__new__ does not accept extra arguments
        return super(Foo, cls).__new__(cls)

    def __init__(self, *args, **kwargs):
        if self._cached:
            return  # restored from the cache, skip the slow work
        time.sleep(2)  # Lots of work
        self.answer = 42
        cache = shelve.open('cache')
        cache[self.__class__.__name__] = self
        cache.close()  # flush the entry to disk and release the file
It works perfectly as is, but it needs too much boilerplate and it doesn't cover all the cases:
Is there any native solution to achieve similar behavior in Python?
Upvotes: 0
Views: 1055
Reputation: 95732
Python 3 provides the functools.lru_cache() decorator for memoizing callables, but I think you're asking to preserve the cache across multiple runs of your application, and at that point the requirements vary so much that you're unlikely to find a 'one size fits all' solution.
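For reference, here is a minimal sketch of what lru_cache gives you within a single process. The function name load_config and the file name foo.yaml are made up for illustration, and the sleep stands in for the slow parsing; nothing here survives a restart of the interpreter.

```python
import functools
import time

@functools.lru_cache(maxsize=None)
def load_config(path):
    # Hypothetical loader: the sleep stands in for slow YAML parsing.
    time.sleep(0.1)
    return {"path": path, "answer": 42}

first = load_config("foo.yaml")   # does the slow work
second = load_config("foo.yaml")  # served from the in-memory cache
print(first is second)
print(load_config.cache_info())
```

The cache key is built from the call arguments only, so a change to the file on disk or to an environment variable is not noticed unless you pass it in as an argument.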
If your own solution works for you then use it. As far as 'too much boilerplate' is concerned, I would extract the caching into a separate mixin class. The first reference to Foo in __new__ probably ought to be cls in any case, and you can use cls.__qualname__ instead of cls.__name__ to reduce the likelihood of class name conflicts (assuming Python 3.3 or later).
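A rough sketch of such a mixin, assuming Python 3.4 or later (for shelve as a context manager). The names ShelveCachedMixin, _cache_path and build are invented for this example. One deliberate difference from the code in the question: it stores the instance's __dict__ rather than the instance itself, because unpickling an instance with pickle protocol 2 or higher calls the class's __new__, which would re-enter the cache lookup. It does not check dependencies or arguments; that part is still up to you.

```python
import os
import shelve
import tempfile

class ShelveCachedMixin:
    """Hypothetical mixin: persist instance state in a shelve file."""
    _cache_path = "cache"  # assumed location of the shelve file
    _cached = False

    def __new__(cls, *args, **kwargs):
        self = super().__new__(cls)
        with shelve.open(cls._cache_path) as cache:
            state = cache.get(cls.__qualname__)
        if state is not None:
            # Restore the saved attributes and flag the instance as cached.
            self.__dict__.update(state)
            self._cached = True
        return self

    def __init__(self, *args, **kwargs):
        if self._cached:
            return  # state was restored in __new__, skip the slow work
        self.build(*args, **kwargs)
        with shelve.open(self._cache_path) as cache:
            cache[type(self).__qualname__] = dict(self.__dict__)


class Foo(ShelveCachedMixin):
    # A throwaway directory keeps the demo from touching real files.
    _cache_path = os.path.join(tempfile.mkdtemp(), "cache")
    build_calls = 0

    def build(self):
        Foo.build_calls += 1  # counts how often the slow path runs
        self.answer = 42      # stands in for the expensive loading


first = Foo()   # runs build() and writes the cache entry
second = Foo()  # restored from the shelve file, build() is skipped
print(second.answer, Foo.build_calls)
```

Subclasses put the slow work in build() instead of __init__, so the caching logic lives in one place.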
Upvotes: 1