Reputation: 5417
I'm writing a dynamic array implementation in Python (similar to the built-in list class), for which I need to observe the growth in capacity (which doubles each time the limit is reached). For that I have the following code, but the output is weird. It looks like sys.getsizeof() never calls my class's __sizeof__(). For the purpose of testing, I'm making __sizeof__() return 0, but according to sys.getsizeof() the size is non-zero.
What's the catch?
import ctypes

class DynamicArray(object):
    '''
    DYNAMIC ARRAY CLASS (Similar to Python List)
    '''

    def __init__(self):
        self.n = 0  # Count actual elements (Default is 0)
        self.capacity = 1  # Default Capacity
        self.A = self.make_array(self.capacity)

    def __len__(self):
        """
        Return number of elements stored in array
        """
        return self.n

    def __getitem__(self, k):
        """
        Return element at index k
        """
        if not 0 <= k < self.n:
            # Check if index k is in bounds of array
            raise IndexError('K is out of bounds!')
        return self.A[k]  # Retrieve from array at index k

    def append(self, ele):
        """
        Add element to end of the array
        """
        if self.n == self.capacity:
            self._resize(2 * self.capacity)  # Double capacity if not enough room
        self.A[self.n] = ele  # Set self.n index to element
        self.n += 1

    def _resize(self, new_cap):
        """
        Resize internal array to capacity new_cap
        """
        print("resize called!")
        B = self.make_array(new_cap)  # New bigger array
        for k in range(self.n):  # Reference all existing values
            B[k] = self.A[k]
        self.A = B  # Call A the new bigger array
        self.capacity = new_cap  # Reset the capacity

    def make_array(self, new_cap):
        """
        Returns a new array with new_cap capacity
        """
        return (new_cap * ctypes.py_object)()

    def __sizeof__(self):
        return 0
The code used to test the resizing:
import sys

arr2 = DynamicArray()
for i in range(100):
    print(len(arr2), " ", sys.getsizeof(arr2))
    arr2.append(i)
And the output:
0 24
1 24
resize called!
2 24
resize called!
3 24
4 24
resize called!
5 24
6 24
7 24
8 24
resize called!
9 24
10 24
11 24
12 24
13 24
14 24
15 24
16 24
resize called!
17 24
18 24
19 24
20 24
21 24
22 24
23 24
24 24
25 24
26 24
27 24
28 24
29 24
30 24
31 24
32 24
resize called!
33 24
34 24
35 24
36 24
37 24
38 24
39 24
40 24
41 24
42 24
43 24
44 24
45 24
46 24
47 24
48 24
49 24
50 24
51 24
52 24
53 24
54 24
55 24
56 24
57 24
58 24
59 24
60 24
61 24
62 24
63 24
64 24
resize called!
65 24
66 24
67 24
68 24
69 24
70 24
71 24
72 24
73 24
74 24
75 24
76 24
77 24
78 24
79 24
80 24
81 24
82 24
83 24
84 24
85 24
86 24
87 24
88 24
89 24
90 24
91 24
92 24
93 24
94 24
95 24
96 24
97 24
98 24
99 24
Upvotes: 4
Views: 844
Reputation: 160557
Your __sizeof__ is getting called; sys.getsizeof just adds the garbage collector overhead to its return value, which is why the result isn't zero.
From the docs on sys.getsizeof:
getsizeof() calls the object's __sizeof__ method and adds an additional garbage collector overhead if the object is managed by the garbage collector.
Returning 0 makes it hard for yourself to see that it's being called, since you'll always get the same result back (0 + overhead).
Return a size based on the contents of the dynamic array to see it change.
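As a rough sketch of that suggestion, here is a stripped-down variant of the class whose __sizeof__ reports the size of the underlying ctypes buffer. The name SizedArray and the choice to count only the pointer buffer (not the instance itself or the stored objects) are my own assumptions for illustration, not the one right way to measure it:

```python
import ctypes
import sys


class SizedArray:
    """Hypothetical cut-down dynamic array that reports its buffer size."""

    def __init__(self):
        self.n = 0
        self.capacity = 1
        self.A = (self.capacity * ctypes.py_object)()

    def append(self, ele):
        if self.n == self.capacity:
            self._resize(2 * self.capacity)  # double when full
        self.A[self.n] = ele
        self.n += 1

    def _resize(self, new_cap):
        B = (new_cap * ctypes.py_object)()
        for k in range(self.n):
            B[k] = self.A[k]
        self.A = B
        self.capacity = new_cap

    def __sizeof__(self):
        # Assumption: count only the raw pointer buffer, in bytes.
        return ctypes.sizeof(self.A)


arr = SizedArray()
report = []
for i in range(9):
    # (element count, value from __sizeof__, value from sys.getsizeof)
    report.append((arr.n, arr.__sizeof__(), sys.getsizeof(arr)))
    arr.append(i)

for n, own, total in report:
    print(n, own, total)
```

The second column should now grow each time the capacity doubles, and the third column should track it with the same constant overhead added on top. The exact byte counts depend on the pointer size and the CPython version.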
To further elaborate:
Each object managed by the garbage collector in CPython has some administrative information attached to it in a PyGC_Head struct, which the collector uses. sys.getsizeof adds the size of that struct to the result:
    /* add gc_head size */
    if (PyObject_IS_GC(o))
        return ((size_t)size) + sizeof(PyGC_Head);
    return (size_t)size;
It is probably added to the overall size because it does represent additional memory required by the object. At the Python level you don't need to worry about garbage collection and can treat it all like magic, but when asking for the size of an object you should not sacrifice correct results just to keep the illusion alive.
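If you want to see that overhead for yourself, one way is to subtract what __sizeof__ returns from what sys.getsizeof reports. This is only a sketch and assumes CPython; the helper name gc_overhead and the class Empty are made up for the example, and the actual number of bytes varies between CPython versions:

```python
import sys


def gc_overhead(obj):
    """Bytes that sys.getsizeof adds on top of obj.__sizeof__()."""
    return sys.getsizeof(obj) - obj.__sizeof__()


class Empty:
    # Same trick as in the question: claim a size of zero.
    def __sizeof__(self):
        return 0


int_overhead = gc_overhead(0)             # ints are not managed by the GC
list_overhead = gc_overhead([])           # lists are managed by the GC
instance_overhead = gc_overhead(Empty())  # so are instances of ordinary classes

print("int:", int_overhead)
print("list:", list_overhead)
print("instance:", instance_overhead)
```

The int should show no overhead, while the list and the class instance should show a non-zero one. That non-zero number is what shows up as the constant in the question's output.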
Upvotes: 6