Display Name

Reputation: 219

Does Python's os.path.getsize() have true byte resolution?

File systems rarely allow files to be arbitrary numbers of bytes long, instead preferring to pad them to fit in a certain number of blocks. Python's os.path.getsize() is documented to return a size in bytes, but I am not sure whether that value is rounded to a block size by the OS (Linux, in my case) or by the filesystem. For my application I need to know the exact number of bytes that I will be able to read out of a large file (~1 GB). What guarantees are made about this?

Upvotes: 0

Views: 1216

Answers (1)

Martijn Pieters

Reputation: 1121654

No guarantees are made by Python itself. The os.path.getsize() function returns the st_size field of an os.stat() call, which in turn is a direct call to the stat system call.

All the documentation for stat simply names st_size as the file size, in bytes.
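As a rough check of that claim, the sketch below writes a file of a known odd length and compares what os.path.getsize(), os.stat().st_size, and an actual read report. The helper name sizes_agree and the file name sample.bin are made up for illustration, and a passing check on one filesystem is evidence, not a guarantee for every filesystem.

```python
import os
import tempfile


def sizes_agree(path):
    """Return (getsize result, st_size, bytes actually read) for a file."""
    reported = os.path.getsize(path)
    stat_size = os.stat(path).st_size
    bytes_read = 0
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so this also works for very large files.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            bytes_read += len(chunk)
    return reported, stat_size, bytes_read


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical sample file; 12345 is deliberately not a block multiple.
    path = os.path.join(tmp, "sample.bin")
    with open(path, "wb") as f:
        f.write(os.urandom(12345))
    result = sizes_agree(path)
    print(result)
```

If all three numbers match, the reported size is the byte count you can read, at least for that file on that filesystem. The size can still change if another process writes to the file between the stat call and the read.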

On my Debian test system, stat gives the true file sizes, not the block-rounded disk usage:

$ stat -fc %s .   # fs block size
4096
$ head -c 2048 < /dev/urandom > 2kb
$ head -c 6168 < /dev/urandom > 6kb
$ head -c 12345 < /dev/urandom > 12andabitkb
$ ls --block-size=1 -s *kb     # block use in bytes
16384 12andabitkb   4096 2kb   8192 6kb
$ ls --block-size=4K -s *kb    # block count per file
4 12andabitkb  1 2kb  2 6kb
$ python3 -c 'import os, glob; print(*("{:<11} {}".format(f, os.path.getsize(f)) for f in glob.glob("*kb")), sep="\n")'
2kb         2048
12andabitkb 12345
6kb         6168
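The same comparison can be sketched from Python alone. This assumes a Unix-like platform where os.stat() exposes st_blocks, which counts allocated 512-byte units; the helper name size_and_allocation and the file name are hypothetical, and the allocated figure depends on the filesystem (it can even be smaller than st_size for sparse files).

```python
import os
import tempfile


def size_and_allocation(path):
    """Return (logical size in bytes, allocated bytes or None)."""
    st = os.stat(path)
    # st_blocks is in 512-byte units and is not available on every platform.
    blocks = getattr(st, "st_blocks", None)
    allocated = None if blocks is None else blocks * 512
    return st.st_size, allocated


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "12andabitkb")
    with open(path, "wb") as f:
        f.write(os.urandom(12345))
    logical, allocated = size_and_allocation(path)
    print("st_size:", logical)
    print("allocated:", allocated)
```

Here st_size corresponds to what os.path.getsize() returns, while the allocated figure corresponds to what ls -s reports.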

Upvotes: 2
