Reputation: 505
The following code fills all my memory:
from sys import getsizeof
import numpy
# from http://stackoverflow.com/a/2117379/272471
def getSize(array):
return getsizeof(array) + len(array) * getsizeof(array[0])
class test():
def __init__(self):
pass
def t(self):
temp = numpy.zeros([200,100,100])
A = numpy.zeros([200], dtype = numpy.float64)
for i in range(200):
A[i] = numpy.sum( temp[i].diagonal() )
return A
a = test()
memory_usage("before")
c = [a.t() for i in range(100)]
del a
memory_usage("After")
print("Size of c:", float(getSize(c))/1000.0)
The output is:
('>', 'before', 'memory:', 20588, 'KiB ')
('>', 'After', 'memory:', 1583456, 'KiB ')
('Size of c:', 8.92)
Why am I using ~1.5 GB of memory if c is ~ 9 KiB? Is this a memory leak? (Thanks)
The memory_usage
function was posted on SO and is reported here for clarity:
def memory_usage(text = ''):
"""Memory usage of the current process in kilobytes."""
status = None
result = {'peak': 0, 'rss': 0}
try:
# This will only work on systems with a /proc file system
# (like Linux).
status = open('/proc/self/status')
for line in status:
parts = line.split()
key = parts[0][2:-1].lower()
if key in result:
result[key] = int(parts[1])
finally:
if status is not None:
status.close()
print('>', text, 'memory:', result['rss'], 'KiB ')
return result['rss']
Upvotes: 2
Views: 4481
Reputation: 505
The implementation of diagonal()
failed to decrement a reference counter. This issue had been previously fixed, but the change didn't make it into 1.7.0.
Upgrading to 1.7.1 solves the problem! The release notes contain various useful identifiers, notably issue 2969.
The solution was provided by Sebastian Berg and Charles Harris on the NumPy mailing list.
Upvotes: 6
Reputation: 1015
I don't think sys.getsizeof returns what you expect
your numpy vector A is 64 bit (8 bytes) - so it takes up (at least)
8 * 200 * 100 * 100 * 100 / (2.0**30) = 1.5625 GB
so at minimum you should use 1.5 GB on the 100 arrays, the last few hundred mg are all the integers used for indexing the large numpy data and the 100 objects
It seems that sys.getsizeof always returns 80 no matter how large a numpy array is:
sys.getsizeof(np.zeros([200,1000,100])) # return 80
sys.getsizeof(np.zeros([20,100,10])) # return 80
In your code you delete a which is a tiny factory object who's t method return huge numpy arrays, you store these huge arrays in a list called c. try to delete c, then you should regain most of your RAM
Upvotes: 0
Reputation: 91149
Python allocs memory from the OS if it needs some.
If it doesn't need it any longer, it may or may not return it again.
But if it doesn't return it, the memory will be reused on subsequent allocations. You should check that; but supposedly the memory consumption won't increase even more.
About your estimations of memory consumption: As azorius already wrote, your temp
array consumes 16 MB, while your A
array consumes about 200 * 8 = 1600 bytes (+ 40 for internal reasons). If you take 100 of them, you are at 164000 bytes (plus some for the list).
Besides that, I have no explanation for the memory consumption you have.
Upvotes: 1