Pavel Patrin

Reputation: 1736

Memory leak in CPython 2.7

For example, I have code that produces many integers:

import sys
import random
a = [random.randint(0, sys.maxint) for i in xrange(10000000)]

After running it, I see VIRT 350M, RES 320M (as shown by htop).

Then I do:

del a

But memory usage is still VIRT 272M, RES 242M (before producing the integers it was VIRT 24M, RES 6M).

pmap for the process shows two big pieces of [anon] memory.

Python 3.4 does not behave this way: there, the memory is freed when I delete the list.

What is happening? Does Python leave the integers in memory?

Upvotes: 1

Views: 160

Answers (1)

Seth

Reputation: 46423

Here's how I can duplicate it. If I start Python 2.7, the interpreter uses about 4.5 MB of memory. (I'm quoting "Real Mem" values from the Mac OS X Activity Monitor.app.)

>>> import random, sys
>>> a = [random.randint(0, sys.maxint) for i in xrange(10000000)]

Now, memory usage is ~ 305.7 MB.

>>> del a

Removing a seems to have no effect on memory.

>>> import gc
>>> gc.collect()   # perform a full collection

Now, memory usage is 27.7 MB. Sometimes, the first call to collect() doesn't seem to do anything, but a second collect() call will clean things up.
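
If you would rather read the numbers from inside the process than from htop or Activity Monitor, here is a rough sketch of the same experiment. The helper names rss_kb and measure are made up for this illustration, and the RSS reading assumes a Linux-style /proc/self/status (it returns None elsewhere). It uses sys.maxsize and range so that it also runs on Python 3; on 2.7 you may prefer xrange for large counts.

```python
import gc
import os
import random
import sys


def rss_kb():
    """Return the current resident set size in kB, or None if /proc is unavailable."""
    path = "/proc/self/status"  # Linux-specific; an assumption of this sketch
    if not os.path.exists(path):
        return None
    with open(path) as status:
        for line in status:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])
    return None


def measure(count):
    """Record RSS at start, after building the list, after del, and after gc.collect()."""
    readings = [("start", rss_kb())]
    numbers = [random.randint(0, sys.maxsize) for _ in range(count)]
    readings.append(("allocated", rss_kb()))
    del numbers
    readings.append(("after del", rss_kb()))
    gc.collect()  # full collection, as in the session above
    readings.append(("after gc.collect()", rss_kb()))
    return readings


if __name__ == "__main__":
    for label, value in measure(200000):
        print("%-20s %s kB" % (label, value))
```

The absolute numbers depend on the platform and allocator, so treat them as indicative only; the interesting part is how the "after del" and "after gc.collect()" readings compare on 2.7 versus 3.x.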

But this behavior is by design; Python isn't leaking. This old FAQ on effbot.org explains a bit more about what's happening:

“For speed”, Python maintains an internal free list for integer objects. Unfortunately, that free list is both immortal and unbounded in size. floats also use an immortal & unbounded free list.

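Essentially, Python 2 holds on to the memory it allocated for integer objects so that it can reuse it for new integers, on the assumption that you will create more of them, instead of handing it back to the operating system as soon as the integers are deleted.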

Consider this:

# 4.5 MB    
>>> a = [object() for i in xrange(10000000)]
# 166.7 MB
>>> del a
# 9.1 MB

In this case, it's pretty obvious that Python is not keeping the objects around in memory: removing a frees them right away, through reference counting, without any explicit gc.collect() call.
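
One way to check that an ordinary object really goes away the moment the last reference is dropped is a weak reference. This is only a small sketch for CPython: the Node class and the freed_without_gc function are names invented for the example (a bare object() instance cannot be weakly referenced, so a trivial class stands in for it), and the cyclic collector is switched off during the check to show that it is not involved.

```python
import gc
import weakref


class Node(object):
    """Trivial class; unlike bare object() instances it supports weak references."""


def freed_without_gc():
    """Return True if deleting the only reference frees the object immediately."""
    was_enabled = gc.isenabled()
    gc.disable()  # rule out the cyclic garbage collector
    try:
        node = Node()
        probe = weakref.ref(node)  # does not keep the object alive
        del node                   # last strong reference is gone
        return probe() is None     # a dead weak reference returns None
    finally:
        if was_enabled:
            gc.enable()


if __name__ == "__main__":
    print(freed_without_gc())
```

Other Python implementations (PyPy, for example) do not use reference counting and may free the object later, so the result is specific to CPython.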

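CPython also keeps a small cache of low-valued integers alive forever (-5 through 256 in CPython 2.7, not 0 - 1000 as I first recalled). This may explain a little of why the gc.collect() call doesn't return as much memory as removing the list of objects, though that cache is tiny, so it can only account for a very small part of the difference.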
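
You can see the small-integer cache directly by comparing object identity. This is a sketch that relies on a CPython implementation detail, and is_shared is just a name chosen for the example. The values go through int(str(...)) so that each one is built at run time instead of being shared as a compile-time constant.

```python
def is_shared(value):
    """Return True if two separately built ints with this value are the same object."""
    first = int(str(value))
    second = int(str(value))
    return first is second


if __name__ == "__main__":
    for candidate in (0, 100, 10 ** 6):
        print("%d shared: %s" % (candidate, is_shared(candidate)))
```

On CPython, small values such as 0 and 100 come back as the same cached object, while a large value such as 10 ** 6 produces two distinct objects. The exact bounds of the cache are not guaranteed by the language, so never rely on `is` to compare numbers in real code.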


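I looked around through the PEPs a bit to figure out why Python 3 is different. However, I didn't see anything obvious. If you really wanted to know, you could dig around in the source code.

Suffice it to say that in Python 3 either the integer free-list behavior has changed or the garbage collector got better. One relevant difference is that Python 3's int is the arbitrary-precision type that Python 2 called long, so the Python 2 int implementation, along with its free list, is gone.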

Many things are better in Python 3.

Upvotes: 1
