Reputation: 21
I'm wondering why Python's mmap() performance degrades over time. I have a small app that makes changes to N files. If the set is big (not really that big, say 1000 files), the first 200 are processed at demon speed, but after that it gets slower and slower. It looks like I should free memory once in a while, but I don't know how and, more importantly, why Python doesn't do this automagically.
Any help?
-- edit --
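It's something like this:
from mmap import mmap
import os

def function(filename, N):
    # 'rb+' opens the file for reading and writing, so the mapping is writable
    with open(filename, 'rb+') as fd:
        size = os.path.getsize(filename)
        mapped = mmap(fd.fileno(), size)
        for i in range(N):
            some_operations_on_mmaped_block()  # placeholder for the real edits
        mapped.close()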
Upvotes: 2
Views: 1701
Reputation: 6182
Your OS caches the mmap'd pages in RAM, so reads and writes run at RAM speed from the cache. Dirty pages are eventually flushed to disk. On Linux, performance will be great until the kernel has to start flushing pages; this is controlled by the vm.dirty_ratio sysctl variable. Once you start flushing dirty pages to disk, your reads compete with those writes on a busy IO bus/device. Another thing to consider is simply whether your OS has enough RAM to cache all the files (see the buffers/cache counters in top output). So I would watch the output of "vmstat 1" while your program runs and watch the cache / buff counters go up until you suddenly start doing IO.
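If you would rather check this from inside the Python program than from a second terminal, here is a rough, Linux-only sketch. It assumes the usual /proc/meminfo and /proc/sys/vm/dirty_ratio files are present; the helper names parse_meminfo and read_cache_state are made up for this example and are not part of any library.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        # Keep only well-formed "Name:   <number> [unit]" lines
        if sep and fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info


def read_cache_state(meminfo_path="/proc/meminfo",
                     dirty_ratio_path="/proc/sys/vm/dirty_ratio"):
    """Return (cached_kb, dirty_kb, dirty_ratio_percent). Linux only (assumption)."""
    with open(meminfo_path) as f:
        info = parse_meminfo(f.read())
    with open(dirty_ratio_path) as f:
        dirty_ratio = int(f.read().strip())
    return info.get("Cached"), info.get("Dirty"), dirty_ratio
```

Calling read_cache_state() every few dozen files and printing the result should show the Dirty figure climbing during the fast phase. If the slowdown lines up with Dirty levelling off or dropping while the disk is busy, that would be consistent with the writeback explanation above.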
Upvotes: 6