user1012037

Reputation: 501

Python program using too much memory

I got these results from Heapy, but it's unclear what exactly they mean.

 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0 262539  59 36961284  48  36961284  48 dict (no owner)
     1  65536  15 34340864  45  71302148  93 dict of myobj.Container
     2  65536  15  2097152   3  73399300  96 myobj.Container

myobj is a class with about 20 True/False values and 20 number values (all of which could be stored in 2 bytes).

I have an array of 256*256 of them. I really don't see why they consume 35 or 70 MB of memory. I'd like to bring it below 10 MB if possible.

Much of the data inside the object is organized into dictionaries for ease of access. The dictionaries themselves don't change and are rather pointless. Would they cause much overhead?

Would it be beneficial to pack all the data into 1 number with bitwise operators? I should be able to store the entire data of the object in 32 or 64 bytes. I was hoping the compiler would do this sort of thing automatically like other languages, but it seems to be doing the opposite.

The class inherits builtin type object for no reason other than to use decorators. Would this cause much overhead?

I'm also curious what "dict (no owner)" means and why it's consuming the other half of the memory.

Edit: sys.getsizeof(myobj.Container) is indeed reporting 450 bytes! This is insane. I only used dictionaries because I need to access data based on an index. As far as I'm concerned the compiler should get rid of the structures and access the values directly. Is there a better way to do that? (I don't imagine Lists are the answer)

Upvotes: 4

Views: 3223

Answers (1)

Winston Ewert

Reputation: 45039

Python doesn't eliminate the overhead of structures like that. Sorry. Its dynamic nature makes such compiler optimizations difficult. But then I don't know any language that would eliminate the overhead introduced by keeping things in dictionaries.

dict (no owner) probably includes all the dictionaries you are creating inside your object. They are marked as no-owner because they aren't dictionaries for object instances.

What you can do:

Use __slots__. If you add __slots__ = ('the','names','of','fields') as a class attribute, Python will use a more efficient implementation of the class. It will get rid of the dictionary used to hold the attributes.
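As a minimal sketch of the __slots__ suggestion: the class and field names below (PlainContainer, SlottedContainer, solid, height) are made up for illustration and are not from the question. It compares an ordinary class with a slotted one holding the same two fields.

```python
import sys


class PlainContainer(object):
    """Ordinary class: each instance carries its own attribute dict."""

    def __init__(self):
        self.solid = False
        self.height = 0


class SlottedContainer(object):
    """Same fields declared in __slots__, so no per-instance dict exists."""

    __slots__ = ('solid', 'height')

    def __init__(self):
        self.solid = False
        self.height = 0


plain = PlainContainer()
slotted = SlottedContainer()

# The slotted instance has no __dict__ at all.
print(hasattr(plain, '__dict__'))
print(hasattr(slotted, '__dict__'))

# Rough per-instance cost: bare instance plus its attribute dict, if any.
plain_total = sys.getsizeof(plain) + sys.getsizeof(plain.__dict__)
slotted_total = sys.getsizeof(slotted)
print(plain_total, slotted_total)
```

The exact byte counts depend on the Python version. One trade-off to keep in mind: assigning an attribute that is not listed in __slots__ raises AttributeError.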

If your dictionaries can be rewritten to use lists, that would improve the situation. Lists are more memory efficient than dictionaries.
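A quick way to check the list-versus-dictionary point on your own interpreter is sketched below. It assumes the dictionaries are keyed by small consecutive integers, which the question suggests but does not confirm; the names as_dict and as_list are illustrative.

```python
import sys

# Hypothetical per-object data: 20 numeric values looked up by index.
as_dict = {i: 0 for i in range(20)}
as_list = [0] * 20

# Size of the container itself (the stored values are not included).
print(sys.getsizeof(as_dict))
print(sys.getsizeof(as_list))

# Index-based access reads the same either way.
print(as_dict[7] == as_list[7])
```

The numbers vary between Python versions, but the list container should come out noticeably smaller.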

For the best efficiency you should rework your system to use numpy arrays. Each attribute in your class would become a 256*256 sized array. Each element will be stored very efficiently space-wise in that case.
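A rough sketch of that layout follows, with one array per attribute instead of one object per cell. The field names (solid, height) and the dtypes are assumptions: booleans stored as one byte each, and each number assumed to fit in an unsigned 16-bit integer.

```python
import numpy as np

SIZE = 256

# Hypothetical fields: one boolean flag and one small integer per cell.
solid = np.zeros((SIZE, SIZE), dtype=np.bool_)
height = np.zeros((SIZE, SIZE), dtype=np.uint16)

# Access goes through the grid coordinates instead of through an object.
solid[10, 20] = True
height[10, 20] = 300

# nbytes is the size of the raw element buffer.
print(solid.nbytes)   # 1 byte per element
print(height.nbytes)  # 2 bytes per element
```

Under those assumptions, 20 flag arrays and 20 number arrays would need about 20 * 65536 + 20 * 131072 bytes, a little under 4 MB in total.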

Alternatively, you could check out PyPy. It provides an alternate implementation of Python with a JIT as well as various time/space optimizations which might help.

sys.getsizeof isn't reporting what you think it's reporting. sys.getsizeof(myobj.Container) reports the size of the class object, not the size of actual Container objects. You want sys.getsizeof(myobj.Container()) or similar. Even that's not accurate, because it doesn't include anything besides the base object. It doesn't take into account the dictionary holding the attributes. It'll only report the size shown in the third row of your report.
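To make the difference concrete, here is a small sketch that measures each piece separately. The Container class below is a hypothetical stand-in for myobj.Container, assumed to hold two dictionaries of 20 entries each; the real class may look different.

```python
import sys


class Container(object):
    """Hypothetical stand-in for myobj.Container."""

    def __init__(self):
        self.flags = {i: False for i in range(20)}
        self.values = {i: 0 for i in range(20)}


obj = Container()

class_size = sys.getsizeof(Container)          # the class object itself
shallow_size = sys.getsizeof(obj)              # the bare instance only
attr_dict_size = sys.getsizeof(obj.__dict__)   # dict holding the attributes
inner_size = sys.getsizeof(obj.flags) + sys.getsizeof(obj.values)

print(class_size, shallow_size, attr_dict_size, inner_size)

# A closer (still approximate) per-instance total.
total_size = shallow_size + attr_dict_size + inner_size
print(total_size)
```

In terms of the Heapy report, shallow_size corresponds to the myobj.Container row, attr_dict_size to the "dict of myobj.Container" row, and inner_size to dictionaries that would show up under "dict (no owner)".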

Upvotes: 6
