purplecity

Reputation: 223

Why is the memory usage of a Python list smaller than expected?


As the session below shows, a list of 50,000,000 records reports only about 404 MiB of memory. Why? Since one record takes 83 bytes, 50,000,000 records should take about 3957 MiB.

>>> import sys
>>> a=[]
>>> for it in range(5*10**7):a.append("miJ8ZNFG9iFqiQQohvyTWwqsij2rJCiZ7v"+str(it))
... 
>>> print(sys.getsizeof(a)/1024**2)
404.4306411743164
>>> print(sys.getsizeof("miJ8ZNFG9iFqiQQohvyTWwqsij2rJCiZ7v"))
83
>>> print(83*5*10**7/1024**2)
3957.7484130859375
>>> 

Upvotes: 5

Views: 238

Answers (1)

ShadowRanger

Reputation: 155363

sys.getsizeof only reports the cost of the list itself, not its contents. So you're seeing the cost of the list object header plus (a little over) 50M pointers; you're likely on a 64-bit system with eight-byte pointers, so the 50M pointers alone take about 381 MiB, and the over-allocation lists use to make appends cheap accounts for the rest of the ~404 MiB.

Getting the true size would require calling sys.getsizeof on each object, each object's __dict__ (if applicable), etc., recursively, and even that won't be 100% accurate, since some of the objects (e.g. small ints) are likely shared; this is not a rabbit hole you want to go down.
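If you want a rough figure anyway, a shallow sum over the elements is good enough for a flat list of unique strings like this one. A minimal sketch (the helper name approx_list_size is my own; it works here only because every string is distinct, and it does not traverse nested containers):

import sys

def approx_list_size(lst):
    # List buffer (header + pointer array) plus each element's own size.
    # Only valid for flat lists: shared/interned elements are counted
    # once per occurrence, and objects inside containers are skipped.
    return sys.getsizeof(lst) + sum(sys.getsizeof(x) for x in lst)

a = ["miJ8ZNFG9iFqiQQohvyTWwqsij2rJCiZ7v" + str(it) for it in range(10**5)]
print(approx_list_size(a) / 1024**2)  # approximate total, in MiB

For anything with shared or nested objects, a tool like pympler's asizeof does the recursive accounting for you, with the same caveat that shared objects make any single number approximate.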

Upvotes: 5
