Reputation: 405
I'm trying to get a sense of how Python's classes are implemented and how much memory is consumed when a class instance holds a reference to a large object. So I created a big numpy array, assigned it to an attribute of one class instance, and then assigned that instance to an attribute of another (below).
import numpy as np

class Foo(object):
    def __init__(self, x=0):
        self.x = x

class Bar(object):
    def __init__(self, x=None):
        self.x = x

x = np.random.normal(0, 1, (50000, 10))
a = Foo(x)
b = Bar(a)
Using sys.getsizeof doesn't seem to help with getting the memory size of numpy arrays. Numpy arrays expose an nbytes attribute, but a class instance that refers to a numpy array doesn't have nbytes.
If I make a change to x, then a.x and b.x.x automatically update to reflect it. The Python documentation notes that object aliases behave like pointers. Am I right that a.x and b.x.x can be thought of more like pointers to the original x? If they are pointer-like, the memory footprint of a and b should thus not be related to the underlying size of x. Is this correct?
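For example, continuing the snippet above, an in-place change shows up through both attributes:

x[0, 0] = 42.0
print(a.x[0, 0], b.x.x[0, 0])  # both print 42.0, reflecting the change
print(a.x is x, b.x.x is x)    # True True: all three names refer to one array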
Upvotes: 3
Views: 3905
Reputation: 80111
Assigning numpy arrays to attributes of these objects only stores object references, which take up pointer-sized slots, so their cost is not related to the actual size of the array.
As for sys.getsizeof(), that gets you the size of the numpy object description but not all of its sub-objects, so it's almost certainly inaccurate for this case. My guess is that the actual total size here would be sys.getsizeof(x) + x.nbytes.
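To see the split concretely (a rough check; exact byte counts vary by platform, Python and numpy version):

import sys

print(sys.getsizeof(a))  # tens of bytes: just the Foo instance, independent of the array
print(x.nbytes)          # 50000 * 10 * 8 = 4,000,000 bytes of raw array data
print(sys.getsizeof(x))  # the ndarray header; recent numpy versions also count the owned buffer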
As for a storing x: yes, by default Python only copies references, not the actual values.
To illustrate this behaviour:
x = []
y = x
x.append('SPAM')
print(y)
# Prints: ['SPAM']
A nice module for analyzing memory usage is pympler.
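For instance, pympler's asizeof sizes a whole object graph rather than a single object (a sketch; whether the numpy data buffer is counted depends on your pympler and numpy versions):

from pympler import asizeof

# Recursively sizes b, following the references to the Foo instance and the array.
print(asizeof.asizeof(b))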
Upvotes: 4