Python - Benchmarking Disk - Write exactly x bytes in a file

Question

I am trying to benchmark my hard drive, that is, to calculate its latency (ms) and throughput (MB/s). To do that, I want to measure the execution time of Python's f.write. What I need is to write exactly x bytes to my file. I understand that I need to open my file using

f = open(file_name, 'wb')

Then what I do is

for i in range(blocksize):
    f.write(b'\xff')

However, the throughput (MB/s) I obtain is far too low. The latency looks correct. So I deduced that when I run the previous lines, I am actually writing more than one byte to the file each time: I am writing a string containing one byte... I know that objects don't really have a size in Python, but is there a way to fix this problem?

EDIT: OK, here is the new code; now the results are inexplicably too high! The write limit for my disk should be 100 MB/s, but I get results ten times faster. What's wrong?

import sys
import time

f = open("test.txt",'wb+')

def file_write_seq_access(blocksize):
    chunk = b'\xff'*4000
    for i in range(blocksize//4000):
        f.write(chunk)

if __name__ == '__main__':
    start_time = time.time()
    file_write_seq_access(int(sys.argv[1]))
    stop_time = time.time()
    diff = stop_time - start_time 
    print diff, "s"
    print (int(sys.argv[1])/diff),"B/s"

Veedrac · Accepted Answer

Simply put, Python isn't fast enough for this kind of byte-by-byte writing, and the file buffering and similar machinery add too much overhead.

What you should do is chunk the operation:

import sys

blocksize = int(sys.argv[1])

chunk = b'\xff'*10000
with open("file.file", "wb") as f:
    for _ in range(blocksize // 10000):
        f.write(chunk)
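
One caveat with the loop above: it writes blocksize // 10000 full chunks, so any remainder is silently dropped when blocksize is not a multiple of 10000. A minimal sketch of one way to write exactly the requested number of bytes follows. The names write_exact and demo are hypothetical helpers introduced only for illustration, and the temporary file stands in for whatever path you actually benchmark.

```python
import os
import tempfile


def write_exact(path, total_bytes, chunk_size=10000):
    """Write exactly total_bytes bytes of 0xff to path, chunk_size at a time."""
    chunk = b'\xff' * chunk_size
    full_chunks, remainder = divmod(total_bytes, chunk_size)
    with open(path, 'wb') as f:
        for _ in range(full_chunks):
            f.write(chunk)
        # The loop alone would drop the last total_bytes % chunk_size bytes,
        # so write a final, shorter slice of the chunk if needed.
        if remainder:
            f.write(chunk[:remainder])
    # The file is closed here, so the reported size reflects everything written.
    return os.path.getsize(path)


def demo(total_bytes=25001):
    # Hypothetical helper: write to a temporary file and return its final size.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        return write_exact(path, total_bytes)
    finally:
        os.remove(path)
```

Calling demo(25001) should report a 25001-byte file: two full chunks plus one 5001-byte tail.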

Running the chunked loop under PyPy might give a further speed-up, though it would be very small and could even be negative.

Note that the OS will interfere with timings here, so there's going to be a lot of variance. Using C might end up even faster.

After doing some timings, the chunked version matches dd for speed, so you're not going to get much faster than that.
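
On the OS interfering with timings: f.write normally returns once the data is in Python's buffer or the OS page cache, not once it is on the disk. That is one likely reason a benchmark can report more than the drive's rated speed, as in the question's EDIT, where the file is never flushed or closed before the timer stops. The sketch below is one possible way to include the flush in the timed region, using f.flush() followed by os.fsync(). The names timed_write and demo are hypothetical, and os.fsync only asks the OS to push its cache to the device; the drive's own cache may still absorb writes, so treat the numbers as approximate.

```python
import os
import tempfile
import time


def timed_write(path, total_bytes, chunk_size=10000, sync=True):
    """Return (seconds, bytes_written) for a chunked sequential write."""
    chunk = b'\xff' * chunk_size
    full_chunks, remainder = divmod(total_bytes, chunk_size)
    start = time.perf_counter()
    with open(path, 'wb') as f:
        for _ in range(full_chunks):
            f.write(chunk)
        if remainder:
            f.write(chunk[:remainder])
        if sync:
            f.flush()             # push Python's buffer to the OS
            os.fsync(f.fileno())  # ask the OS to push its cache to the device
    # The timer stops only after the file has been flushed and closed.
    elapsed = time.perf_counter() - start
    return elapsed, os.path.getsize(path)


def demo(total_bytes=1000000):
    # Hypothetical helper: time a write to a temporary file, then clean up.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        return timed_write(path, total_bytes)
    finally:
        os.remove(path)
```

Throughput in MB/s is then bytes_written / seconds / 1e6. Comparing runs with sync=True and sync=False gives a rough idea of how much of the measured speed comes from caching rather than from the disk.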
