Slug Pue

Reputation: 256

Threads each writing their own file is slower than writing all files sequentially

I am learning about threading in Python and wrote a short test program which creates 10 CSV files and writes 100k lines to each of them. I assumed it would be faster to let 10 threads each write their own file, but for some reason it is 2x slower than simply writing all the files in sequence.

I think this might have to do with how the OS schedules the threads, but I'm not sure. I am running this on Linux.

I would greatly appreciate it if someone could shed some light on why this is the case.

Multi-thread version:

import thread, csv


N = 10  #number of threads

exitmutexes = [False]*N

def filewriter(id_):
    with open('files/'+str(id_)+'.csv', 'wb') as f:
        writer = csv.writer(f, delimiter=',')
        for i in xrange(100000):
            writer.writerow(["efweef", "wefwef", "666w6efw", "6555555"])
    exitmutexes[id_] = True

for i in range(N):
    thread.start_new_thread(filewriter, (i,))

while False in exitmutexes: #checks whether all threads are done
    pass

Note: I have tried including a sleep in the while-loop so that the main thread yields at intervals, but this had no substantial effect.
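
For completeness, here is an equivalent sketch using the higher-level threading module, where the main thread blocks in join() instead of busy-waiting (this only changes how the main thread waits, not what the worker threads do):

import threading, csv

N = 10  # number of threads, matching the test above

def filewriter(id_):
    with open('files/' + str(id_) + '.csv', 'wb') as f:
        writer = csv.writer(f, delimiter=',')
        for i in xrange(100000):
            writer.writerow(["efweef", "wefwef", "666w6efw", "6555555"])

threads = [threading.Thread(target=filewriter, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # block until each thread finishes; no busy-wait needed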

Regular version:

import time, csv


for i in range(10):
    with open('files2/'+str(i)+'.csv', 'wb') as f:
        writer = csv.writer(f, delimiter=',')
        for _ in xrange(100000):  # '_' avoids shadowing the outer loop variable
            writer.writerow(["efweef", "wefwef", "666w6efw", "6555555"])

Upvotes: 0

Views: 573

Answers (2)

Lie Ryan

Reputation: 64827

There are several issues:

  1. Due to the Global Interpreter Lock (GIL), Python will not use more than one CPU core at a time for the data-generation part, so that part won't be sped up by running multiple threads. You need multiprocessing to speed up a CPU-bound operation (see the sketch after this list).
  2. But that's not really the core of the problem here, because the GIL is released during I/O such as writing to disk. The core of the problem is that you're writing to ten different places at a time, which most likely causes the hard disk head to thrash as it seeks between ten different locations on the disk. Serial writes are almost always fastest on a hard disk.
  3. Even if you have a CPU-bound operation and use multiprocessing, using ten workers won't give you any significant advantage in data generation unless you actually have ten CPU cores. If you use more threads than the number of CPU cores, you pay the cost of thread switching, but you never speed up the total runtime of a CPU-bound operation.
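
A minimal sketch of what point 1 means by multiprocessing, reusing the filewriter function from the question; each worker is a separate process with its own GIL. Whether this actually helps here depends on the workload being CPU-bound rather than disk-bound:

import multiprocessing, csv

def filewriter(id_):
    with open('files/' + str(id_) + '.csv', 'wb') as f:
        writer = csv.writer(f, delimiter=',')
        for i in xrange(100000):
            writer.writerow(["efweef", "wefwef", "666w6efw", "6555555"])

if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=10)  # one worker process per file
    pool.map(filewriter, range(10))            # blocks until every file is written
    pool.close()
    pool.join()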

If you use more threads than available CPU cores, the total run time always increases or at best stays the same. The only reason to use more threads than CPU cores is if you are consuming the results of the threads interactively, or in a pipeline with other systems. There are edge cases where you can speed up a poorly designed, I/O-bound program by using threads, but a well designed single-threaded program will most likely perform just as well or better.
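
If you want to check this on your own machine, a crude wall-clock harness is enough to compare the two versions (a sketch; write_all_sequential and write_all_threaded are hypothetical wrappers around the question's two versions):

import time

def timed(label, func):
    # crude wall-clock timing, good enough to compare the two approaches
    start = time.time()
    func()
    print label, 'took', time.time() - start, 'seconds'

# hypothetical wrappers around the question's two versions:
# timed('sequential', write_all_sequential)
# timed('threaded',   write_all_threaded)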

Upvotes: 4

Tomasz Kaminski

Reputation: 910

Sounds like the dreaded GIL (Global Interpreter Lock):

"In CPython, the global interpreter lock, or GIL, is a mutex that prevents multiple native threads from executing Python bytecodes at once. This lock is necessary mainly because CPython's memory management is not thread-safe. (However, since the GIL exists, other features have grown to depend on the guarantees that it enforces.)"

This essentially means that within one Python interpreter (and thus one script), no two threads execute Python bytecode simultaneously, so a CPU-bound script is effectively limited to one logical core unless you spawn separate processes.
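
As a quick illustration (a minimal sketch, not from the linked page), you can time a purely CPU-bound loop run twice in sequence and then split across two threads; on CPython the threaded run is no faster, and is often slower because of GIL contention:

import threading, time

def count(n):
    while n > 0:
        n -= 1  # pure bytecode work; the GIL serializes it across threads

N = 10000000

start = time.time()
count(N); count(N)  # two runs back to back
print 'serial:  ', time.time() - start

start = time.time()
t1 = threading.Thread(target=count, args=(N,))
t2 = threading.Thread(target=count, args=(N,))
t1.start(); t2.start()
t1.join(); t2.join()  # two runs "in parallel"
print 'threaded:', time.time() - start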

Consult this page for more details: https://wiki.python.org/moin/GlobalInterpreterLock

Upvotes: 1
