Reputation: 21981
Is there a way to cache Python file handles? I have a function that takes a netCDF file path as input, opens the file, extracts some data from it, and closes it. It gets called many times, and the overhead of opening the file each time is high.
How can I make it faster, perhaps by caching the file handle? Is there a Python library that does this?
Upvotes: 6
Views: 416
Reputation: 4903
Yes, you can use the following Python libraries: dill (to serialize the file handle) and python-memcached (the memcache module, to store it).
Here is an example with two files:
# save.py - puts the serialized file handle object into memcached
import dill
import memcache
mc = memcache.Client(['127.0.0.1:11211'], debug=0)
file_handler = open('data.txt', 'r')
mc.set("file_handler", dill.dumps(file_handler))
print('saved!')
and
# read_from_file.py - gets the serialized file handle object from memcached,
# then deserializes it and reads lines from it
import dill
import memcache
mc = memcache.Client(['127.0.0.1:11211'], debug=0)
file_handler = dill.loads(mc.get("file_handler"))
print(file_handler.readlines())
Now if you run:
python save.py
python read_from_file.py
the second script prints the lines of data.txt using the handle that the first script stored.
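Why does it work?
Because you didn't close the file (file_handler.close()), dill can record what is needed to recreate the handle: the file name, the mode and the current position. dill.loads() then gives you a usable file object again, even in a different process. As far as I can tell, the handle in the other process is rebuilt from that information rather than shared, so the file is opened again when the handle is loaded.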
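To make that concrete, here is a minimal stdlib-only sketch of what restoring a handle from stored information could look like. It assumes the serializer keeps the name, mode and position; describe_handle and restore_handle are hypothetical helpers written for illustration, not part of dill or memcache.

```python
import os
import tempfile


def describe_handle(fh):
    """Capture what is needed to reopen the file later (hypothetical helper)."""
    return {"name": fh.name, "mode": fh.mode, "position": fh.tell()}


def restore_handle(state):
    """Reopen the file and seek back to the recorded position (hypothetical helper)."""
    fh = open(state["name"], state["mode"])
    fh.seek(state["position"])
    return fh


# Create a small throwaway file to demonstrate with.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\n")
    path = tmp.name

original = open(path, "r")
original.readline()                # consume the first line
state = describe_handle(original)  # a plain dict that could be pickled and stored
original.close()

restored = restore_handle(state)   # this is a fresh open() call
remaining = restored.readlines()   # continues where the original left off
restored.close()
os.remove(path)
```

The point of the sketch is that the restored object is a new handle on the same file, so the cost of opening the file is paid again on every load.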
Solution
import dill
import memcache
mc = memcache.Client(['127.0.0.1:11211'], debug=0)
# Reuse the handle stored in memcached if there is one; otherwise open the file and store it
serialized = mc.get("file_handler")
if serialized:
    file_handler = dill.loads(serialized)
else:
    file_handler = open('data.txt', 'r')
    mc.set("file_handler", dill.dumps(file_handler))
print(file_handler.readlines())
Upvotes: 3
Reputation: 4236
What about this?
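filehandle = None

def get_filehandle(filename):
    # Reuse the module-level handle; reopen only if it is missing or closed.
    # Note: only one handle is cached, whatever filename is passed later.
    global filehandle
    if filehandle is None or filehandle.closed:
        filehandle = open(filename, "r")
    return filehandle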
You may want to encapsulate this into a class to prevent other code from messing with the filehandle variable.
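A rough sketch of what such a class might look like, keeping one handle per path so that different files do not overwrite each other. FileHandleCache and its method names are made up for illustration, and it uses the built-in open() on a temporary text file rather than a netCDF reader.

```python
import os
import tempfile


class FileHandleCache:
    """Keep one open handle per path and reuse it across calls (illustrative)."""

    def __init__(self, mode="r"):
        self._mode = mode
        self._handles = {}  # path -> open file object

    def get(self, path):
        """Return the cached handle for path, opening it if missing or closed."""
        fh = self._handles.get(path)
        if fh is None or fh.closed:
            fh = open(path, self._mode)
            self._handles[path] = fh
        return fh

    def close_all(self):
        """Close every cached handle and forget it."""
        for fh in self._handles.values():
            fh.close()
        self._handles.clear()


# Demonstration with a throwaway file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("hello\n")
    demo_path = tmp.name

cache = FileHandleCache()
first = cache.get(demo_path)
second = cache.get(demo_path)
same_handle = first is second         # second call reuses the open handle

cache.close_all()
reopened = cache.get(demo_path)       # after closing, a new handle is opened
was_reopened = reopened is not first
cache.close_all()
os.remove(demo_path)
```

Callers that read from a cached handle share its position, so you would probably want to seek(0) (or seek to wherever you need) before each read. The same pattern should carry over to a netCDF library by swapping open() for that library's own constructor, though the closed check would need adapting.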
Upvotes: -1