robince

Reputation: 10967

data cache for python package

I have a Python module which generates large data files which I want to cache on disk for future use. The cache is likely to end up at some hundreds of MB for a normal user, but it saves a lot of computation time.

The files aren't distributed with the module, but are generated the first time the code is run with a given set of parameters.

So far I've just been using a single file module myself and putting them in a hardcoded path relative to the module (data/). But I now need to distribute this module in a Python package with distutils and I was wondering if there is a standard way to do that.

I was thinking of something like the compiled cache of scipy.weave - but wondering if there is a more modern supported way of doing it. On *nix platforms I would expect it to go in ~/.something but I'm not sure what the Windows equivalent would be. Also this should be configurable so that users can point it somewhere else if it's more convenient, or share the cache dir between users. How should such a config file work? Where should it go?

Or should I just have it as an install option, either through a config file next to setup.py or set by manually editing setup.py, then hard code the directory in the module before installation?

Any pointers gratefully received...

Upvotes: 4

Views: 645

Answers (2)

merwok

Reputation: 6907

There is an emerging standard in the free OS world: http://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html

This module can help you for Windows and Mac OS X, but it seems to be broken with respect to the XDG Base Dir Spec: http://pypi.python.org/pypi/appdirs
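If you would rather not depend on a third-party module, the lookup can be sketched by hand. This is only a rough illustration, not a full implementation of the spec: `default_cache_dir` and the application name passed to it are hypothetical, the `environ` and `platform` parameters exist only to make the function easy to test, and the Windows (`%LOCALAPPDATA%`) and Mac OS X (`~/Library/Caches`) branches are conventions I am assuming rather than anything the XDG spec defines.

```python
import os
import sys


def default_cache_dir(app_name, environ=None, platform=None):
    """Return a per-user cache directory for app_name (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    platform = sys.platform if platform is None else platform
    home = os.path.expanduser("~")

    if platform.startswith("win"):
        # Assumption: %LOCALAPPDATA% is the usual per-user location on Windows;
        # fall back to the home directory if it is not set.
        base = environ.get("LOCALAPPDATA") or home
    elif platform == "darwin":
        # Assumption: the conventional per-user cache location on Mac OS X.
        base = os.path.join(home, "Library", "Caches")
    else:
        # XDG Base Directory spec: use $XDG_CACHE_HOME if it is set to an
        # absolute path, otherwise default to ~/.cache.
        base = environ.get("XDG_CACHE_HOME", "")
        if not os.path.isabs(base):
            base = os.path.join(home, ".cache")

    return os.path.join(base, app_name)
```

Calling `default_cache_dir("mytool")` only computes the path; the caller still has to create the directory (for example with `os.makedirs`) before writing to it.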

Upvotes: 2

Ned Batchelder

Reputation: 375754

You can use the standard library module ConfigParser (renamed configparser in Python 3) to parse an ini file (or .rc file depending on your culture). To find the file, os.path.expanduser is a useful function that does the right thing on all platforms for paths like "~/.mytoolrc". To let the user override the location of things, you can use environment variables via os.environ.
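A minimal sketch of how those three pieces might fit together, assuming names of my own choosing: the `MYTOOL_CACHE_DIR` environment variable, the `[cache]` section with a `dir` option, the `~/.mytool/cache` fallback, and the `get_cache_dir` function itself are all hypothetical. The lookup order (environment variable, then rc file, then built-in default) is one reasonable choice, not the only one.

```python
import os

try:
    import configparser  # Python 3
except ImportError:
    import ConfigParser as configparser  # Python 2


def get_cache_dir(environ=None, rc_path="~/.mytoolrc"):
    """Resolve the cache directory (hypothetical helper, illustrative names)."""
    environ = os.environ if environ is None else environ

    # 1. An environment variable takes precedence, so a user can redirect
    #    the cache (for example to a shared directory) without editing files.
    override = environ.get("MYTOOL_CACHE_DIR")
    if override:
        return os.path.expanduser(override)

    # 2. Otherwise look for a "dir" option in the [cache] section of the
    #    rc file. read() silently skips files that do not exist.
    parser = configparser.ConfigParser()
    parser.read(os.path.expanduser(rc_path))
    if parser.has_option("cache", "dir"):
        return os.path.expanduser(parser.get("cache", "dir"))

    # 3. Fall back to a built-in default under the user's home directory.
    return os.path.expanduser(os.path.join("~", ".mytool", "cache"))
```

With this layout, a `~/.mytoolrc` containing a `[cache]` section and a line such as `dir = /srv/shared/mytool-cache` would point every run at the shared directory, and setting `MYTOOL_CACHE_DIR` would override it for a single session.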

Upvotes: 3
