user4936236

Difference in size of a file and its contents

Why is there a difference between the size of a file and the size of the bytes object returned after reading it?

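import os
import sys

print(os.path.getsize(filepath))  # size of the file on disk, in bytes

with open(filepath, 'rb') as f:  # the file is closed automatically
    contents = f.read()
print(sys.getsizeof(contents))  # size of the bytes object in memory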

Upvotes: 0

Views: 126

Answers (1)

Green Cloak Guy

Reputation: 24691

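sys.getsizeof() returns the amount of memory that the contents object occupies, not the amount of data it holds. That figure covers the file's data plus the object's own bookkeeping. In CPython, a bytes object also stores a reference count, a pointer to its type, its length, a cached hash, and a trailing null byte. Methods are not part of this; they live on the bytes type and are shared by every instance.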

You can demonstrate this with other datatypes. A Python float wraps a C double, so you might expect it to take 8 bytes, but

>>> sys.getsizeof(4.5)
24

The remaining bytes are the overhead of the object itself. On a typical 64-bit CPython build, that is a reference count and a pointer to the object's type, 8 bytes each.

You'll note that len(contents), the number of bytes that the bytes object contains, is exactly equal to os.path.getsize(filepath), as expected:

>>> print(os.path.getsize(filepath))
153
>>> f = open(filepath, 'rb')
>>> contents = f.read()
>>> type(contents)
<class 'bytes'>
>>> sys.getsizeof(contents)
186
>>> len(contents)
153

Upvotes: 6
