Isaac

Reputation: 363

Why is this longer list smaller than this shorter list?

Why does sys.getsizeof report img_list as smaller than compressed, even though img_list holds more data?

Input

import sys
import numpy as np

# img, img_list and compressed are built earlier (not shown)
print(sys.getsizeof(img_list[0:1]))
print(img_list[0:1])
print(sys.getsizeof(compressed[0:2]))
print(compressed[0:2])

print(sys.getsizeof(img_list))
print(sys.getsizeof(compressed))

img_arr = np.asanyarray(img)
print(img_arr.shape)

comp_arr = np.asarray(compressed)
print(comp_arr.shape)

Output

72
[[[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [19, 19, 19], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]]]
80
[[12, [0, 0, 0]], [1, [19, 19, 19]]]

256
2536

(24, 24, 3)
(306, 2)

Upvotes: 2

Views: 102

Answers (1)

sys.getsizeof() is deceiving. It returns the size of the object itself, but it does not include the size of the objects that object refers to. For a list, that means you get the list header plus one pointer per element, not the elements themselves. In other words, it does not recurse into the object you are measuring. If you want the total, you need to do that yourself:

import sys

l1 = [[[0, 0, 0], [0, 19, 19], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 13, 0], [0, 0, 0], [19, 19, 19], [110, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 12, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]]]

l2 = [[12, [0, 0, 0]], [1, [19, 19, 19]]]

def get_size(obj, seen=None):
    """Recursively finds size of objects"""
    size = sys.getsizeof(obj)
    if seen is None:
        seen = set()
    obj_id = id(obj)
    if obj_id in seen:
        return 0
    # Important: mark as seen *before* entering recursion to gracefully handle
    # self-referential objects
    seen.add(obj_id)
    if isinstance(obj, dict):
        size += sum([get_size(v, seen) for v in obj.values()])
        size += sum([get_size(k, seen) for k in obj.keys()])
    elif hasattr(obj, '__dict__'):
        size += get_size(obj.__dict__, seen)
    elif hasattr(obj, '__iter__') and not isinstance(obj, (str, bytes, bytearray)):
        size += sum([get_size(i, seen) for i in obj])
    return size

print(get_size(l1))
print(get_size(l2))

Output:

5280
564

Reference: Measure the Real Size of Any Python Object
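One detail that affects totals like the ones above: because of the `seen` set, an object reachable through several references is counted only once. The sketch below illustrates this with a cut-down helper that handles nested lists only. The name `deep_size` and the variables `row`, `shared` and `separate` are made up for this illustration and are not part of the referenced recipe.

```python
import sys

def deep_size(obj, seen=None):
    """Hypothetical helper: deep size of nested lists, counting each object once."""
    if seen is None:
        seen = set()
    if id(obj) in seen:
        return 0  # already counted through another reference
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, list):
        size += sum(deep_size(item, seen) for item in obj)
    return size

row = [0, 0, 0]
shared = [row, row, row]                   # three references to one list object
separate = [[0, 0, 0] for _ in range(3)]   # three distinct list objects

# The shared row is counted once, so the first total is smaller
print(deep_size(shared) < deep_size(separate))
```

This is also why the repeated `0` values in the example lists add very little to the total: they are typically the same cached int object in CPython, so they are counted a single time.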

What you were effectively comparing was only the outer lists, which is basically:

sys.getsizeof([[]])

and

sys.getsizeof([[], []]) # This is bigger: the outer list holds two references instead of one
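To make the point concrete, here is a minimal sketch (assuming CPython; the variable names are invented for illustration) showing that the shallow size depends on how many references the outer list holds, not on how much data sits behind them:

```python
import sys

tiny = [0]
huge = list(range(1000))

two_tiny = [tiny, tiny]   # outer list with two references to a small list
two_huge = [huge, huge]   # outer list with two references to a large list

# Same number of top-level references, so the shallow sizes match
print(sys.getsizeof(two_tiny) == sys.getsizeof(two_huge))

# More top-level references means a larger shallow size
print(sys.getsizeof([[], []]) > sys.getsizeof([[]]))
```

So a list with 306 short entries will report a larger sys.getsizeof value than a list with 24 long rows, regardless of what those rows contain.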

Upvotes: 1
