Reputation: 223
As seen below, 50,000,000 records only take 404 MB of memory. Why? If one record takes 83 bytes, 50,000,000 records should take about 3957 MB.
>>> import sys
>>> a=[]
>>> for it in range(5*10**7):a.append("miJ8ZNFG9iFqiQQohvyTWwqsij2rJCiZ7v"+str(it))
...
>>> print(sys.getsizeof(a)/1024**2)
404.4306411743164
>>> print(sys.getsizeof("miJ8ZNFG9iFqiQQohvyTWwqsij2rJCiZ7v"))
83
>>> print(83*5*10**7/1024**2)
3957.7484130859375
>>>
Upvotes: 5
Views: 238
Reputation: 155363
sys.getsizeof only reports the cost of the list itself, not its contents. So you're seeing the cost of storing the list object header, plus (a little over) 50M pointers; you're likely on a 64-bit (eight-byte pointer) system, thus storage for 50M pointers is ~400 MB. Getting the true size would require sys.getsizeof to be called for each object, each object's __dict__ (if applicable), etc., recursively, and it won't be 100% accurate since some of the objects (e.g. small ints) are likely shared; this is not a rabbit hole you want to go down.
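If you do want a rough "deep" size anyway, here is a minimal sketch of the recursive approach described above. The deep_getsizeof name is made up for illustration; it deduplicates shared objects by id so nothing is counted twice, but it still carries the caveats mentioned (interned/shared objects, C-level buffers it can't see):

```python
import sys

def deep_getsizeof(obj, seen=None):
    # Sum sys.getsizeof over an object and everything it references,
    # counting each distinct object (by id) only once.
    if seen is None:
        seen = set()
    if id(obj) in seen:
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_getsizeof(k, seen) + deep_getsizeof(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_getsizeof(item, seen) for item in obj)
    if hasattr(obj, "__dict__"):
        size += deep_getsizeof(obj.__dict__, seen)
    return size

# Smaller version of the list from the question (1000 records, not 50M):
a = ["miJ8ZNFG9iFqiQQohvyTWwqsij2rJCiZ7v" + str(it) for it in range(1000)]
print(sys.getsizeof(a))    # list header + pointer array only
print(deep_getsizeof(a))   # list plus the strings it points to
```

For real measurements, a library such as Pympler (pympler.asizeof) does this more carefully than a hand-rolled traversal.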
Upvotes: 5