Why is there a difference between the size of a file and the size of the bytes object returned after reading that file?
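import os
import sys

print(os.path.getsize(filepath))      # size of the file on disk, in bytes
with open(filepath, 'rb') as f:       # the with block closes the file
    contents = f.read()
print(sys.getsizeof(contents))        # size of the bytes object in memory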
Upvotes: 0
Views: 126
Reputation: 24691
sys.getsizeof() returns the size, in bytes, of the memory allocated for the contents object. That covers not just the data read from the file but also the object header (such as the reference count and the pointer to the object's type) and the other internal fields of the bytes data structure.
You can demonstrate this with other datatypes. You would expect a simple floating-point number to be 4 or 8 bytes large, but:
>>> sys.getsizeof(4.5)
24
The value itself is an 8-byte double. The remaining 16 bytes are per-object overhead: on a typical 64-bit CPython build, a reference count and a pointer to the object's type.
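As a rough illustration of that overhead, the sketch below compares the raw size of a C double with what sys.getsizeof() reports for a float, and also measures an empty bytes object. The variable names are only illustrative, and the exact numbers are an assumption that depends on the interpreter: sys.getsizeof() is CPython-specific in its details, so 32-bit builds or other implementations may report different values.

```python
import struct
import sys

# Size of the raw value: a C double, as stored inside a Python float.
payload_size = struct.calcsize('d')

# Size of the whole float object, header included (CPython-specific).
object_size = sys.getsizeof(4.5)

# Whatever is left over is per-object overhead, not data.
float_overhead = object_size - payload_size

# Even a bytes object holding no data at all occupies some memory.
empty_bytes_size = sys.getsizeof(b'')

print('payload:', payload_size)
print('float object:', object_size)
print('float overhead:', float_overhead)
print('empty bytes object:', empty_bytes_size)
```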
You'll note that len(contents), the number of bytes that the bytes object contains, is precisely the same as os.path.getsize(filepath), as expected:
>>> print(os.path.getsize(filepath))
153
>>> f = open(filepath, 'rb')
>>> contents = f.read()
>>> type(contents)
<class 'bytes'>
>>> sys.getsizeof(contents)
186
>>> len(contents)
153
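The session above can be reproduced without an existing file. The following is a minimal sketch, assuming CPython: size_report is a hypothetical helper (not part of any library) that writes a payload to a temporary file, reads it back, and returns the three measurements. Running it on two payloads of different lengths suggests that the gap between sys.getsizeof() and len() is a fixed per-object cost rather than something that grows with the file.

```python
import os
import sys
import tempfile


def size_report(payload: bytes) -> dict:
    """Write payload to a temporary file, read it back, and compare sizes.

    Hypothetical helper for illustration only.
    """
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    try:
        with open(path, 'rb') as f:
            contents = f.read()
        return {
            'file_size': os.path.getsize(path),    # bytes on disk
            'len': len(contents),                  # bytes of data in memory
            'getsizeof': sys.getsizeof(contents),  # data plus object overhead
        }
    finally:
        os.remove(path)  # clean up the temporary file


small = size_report(b'x' * 153)
large = size_report(b'x' * 10_000)

# Overhead of the bytes object itself, independent of the payload length.
overhead_small = small['getsizeof'] - small['len']
overhead_large = large['getsizeof'] - large['len']

print(small)
print(large)
print('same overhead for both:', overhead_small == overhead_large)
```

If you want to know how much data you read, len(contents) is the number to use; sys.getsizeof() is only useful for estimating memory consumption.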
Upvotes: 6