Reputation: 125
I am attempting to run two long-running operations simultaneously in Python. They both operate on the same data set, but do not modify it. I have found that a threaded implementation runs slower than simply running them one after the other.
I have created a simplified example to show what I am experiencing.
Running this code with the runNonThreaded call in main() commented out (so the work runs in two threads) results in a runtime on my machine of around 1:01 (minutes:seconds). I see two CPUs run at around 50% for the full run time.
Commenting out the runThreaded call instead (so the calculations run sequentially) results in a runtime of around 35 seconds, with one CPU pegged at 100% for the full run time.
In both cases, both calculations run to completion.
from datetime import datetime
import threading


class num:
    """Simple counter object that each worker increments."""

    def __init__(self):
        self._num = 0

    def increment(self):
        self._num += 1

    def getValue(self):
        return self._num


class incrementNumber(threading.Thread):
    """Thread that increments one num instance 50 million times."""

    def __init__(self, number):
        self._number = number
        threading.Thread.__init__(self)

    def run(self):
        self.incrementProcess()

    def incrementProcess(self):
        for i in range(50000000):
            self._number.increment()


def runThreaded(x, y):
    # Start both threads, then wait for both to finish.
    x.start()
    y.start()
    x.join()
    y.join()


def runNonThreaded(x, y):
    # Run the same work back to back in the calling thread.
    x.incrementProcess()
    y.incrementProcess()


def main():
    t = datetime.now()
    x = num()
    y = num()
    incrementX = incrementNumber(x)
    incrementY = incrementNumber(y)
    # Leave exactly one of the next two calls uncommented.
    runThreaded(incrementX, incrementY)
    #runNonThreaded(incrementX, incrementY)
    print("%d %d" % (x.getValue(), y.getValue()))
    print(datetime.now() - t)


if __name__ == "__main__":
    main()
Upvotes: 1
Views: 1462
Reputation: 28846
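CPython has a so-called Global Interpreter Lock (GIL), which means that only one thread can execute Python bytecode at a time, even when multithreading. Your two threads therefore take turns rather than running in parallel, and the lock handoff between them adds overhead, which is why the threaded version is slower. You might want to look into the multiprocessing module, which avoids this constraint by using separate processes.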
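As a rough sketch of what the multiprocessing route could look like for CPU-bound work like the loop in the question: the function names count_to, run_sequential and run_in_processes are made up for illustration, the loop size is reduced so it finishes quickly, and Python 3 is assumed. Only multiprocessing.Pool and its map method come from the standard library.

```python
import multiprocessing
import time


def count_to(n):
    """CPU-bound loop standing in for the question's incrementProcess (hypothetical helper)."""
    total = 0
    for _ in range(n):
        total += 1
    return total


def run_sequential(sizes):
    # Run every loop one after the other in the current process.
    return [count_to(n) for n in sizes]


def run_in_processes(sizes):
    # Each worker is a separate interpreter process with its own GIL,
    # so the loops can run on different cores at the same time.
    with multiprocessing.Pool(processes=len(sizes)) as pool:
        return pool.map(count_to, sizes)


if __name__ == "__main__":
    # The guard matters: child processes may re-import this module.
    sizes = [5000000, 5000000]
    for runner in (run_sequential, run_in_processes):
        start = time.perf_counter()
        results = runner(sizes)
        elapsed = time.perf_counter() - start
        print(runner.__name__, results, "%.2fs" % elapsed)
```

Note that the workers return their results to the parent rather than mutating shared objects. Arguments and return values are pickled between processes, so incrementing a num instance inside a child process would not change the parent's copy.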
The GIL means that Python multithreading is only useful for I/O-bound operations, other things that wait for stuff to happen, or if you're calling a C extension that releases the GIL while doing work.
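For contrast, here is a minimal sketch of the case where threads do help. It assumes Python 3 and uses time.sleep as a stand-in for a blocking I/O call (sleep releases the GIL while waiting). The names wait_for_io and run_threaded_waits are hypothetical.

```python
import threading
import time


def wait_for_io(seconds, results, index):
    # time.sleep stands in for a blocking read or network call;
    # the GIL is released while the thread waits.
    time.sleep(seconds)
    results[index] = seconds


def run_threaded_waits(delays):
    """Start one thread per delay and return (results, elapsed seconds)."""
    results = [None] * len(delays)
    threads = [
        threading.Thread(target=wait_for_io, args=(delay, results, i))
        for i, delay in enumerate(delays)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = run_threaded_waits([0.5, 0.5])
    print(results, "%.2fs" % elapsed)
```

Because the waits overlap, the total time should be close to the longest single delay rather than the sum of the delays, which is the opposite of what happens with the pure-Python counting loop.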
Upvotes: 5