Reputation: 1223
I have a large file as input to my Python code, and it produces a corresponding output file. However, it takes too much time, and I want to speed it up.
Right now, I have split the large file into 1000 smaller files. I want a small script that will launch 1000 threads, each thread running my original Python code and writing to its own output file.
Can anyone give me some sample/example code?
Upvotes: 0
Views: 206
Reputation: 481
What you're searching for is multiprocessing: https://docs.python.org/2/library/multiprocessing.html
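For example, a rough sketch with multiprocessing.Pool might look like the following (process_file and the file names are placeholders standing in for your own code and your split files):
import multiprocessing

def process_file(path):
    # Placeholder for your existing per-file logic:
    # read `path` and write a corresponding output file.
    with open(path) as src, open(path + '.out', 'w') as dst:
        dst.write(src.read())

if __name__ == '__main__':
    files = ['part_0001', 'part_0002']  # your 1000 split files
    pool = multiprocessing.Pool()  # defaults to one worker per core
    pool.map(process_file, files)
    pool.close()
    pool.join()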
Upvotes: 1
Reputation: 365717
First, using 1000 threads will almost certainly slow things down, not speed it up. Even if your code is completely I/O bound, 1000 is pushing the limits of many platforms' schedulers, and you'll spend more time context switching than doing actual work.
Next, you need to know whether your code is CPU-bound (that is, doing actual processing on information in memory) or I/O-bound (that is, waiting on things like disk reads and writes).
If your code is CPU-bound, and you can keep the CPU busy pretty consistently, you want exactly 1 thread per core. That way, you get the maximum amount of parallelism with the minimum amount of context switching (and cache thrashing, assuming most of the work is done on either immutable or non-shared values).
Also (unless that work is being done in specially-designed C extensions like numpy), you want these threads to be in separate processes, because only 1 thread per process can run the Python interpreter at a time, thanks to the Global Interpreter Lock.
So, what you want is almost certainly a process pool. The easiest way to do that is to use concurrent.futures.ProcessPoolExecutor, possibly with a max_workers argument (maybe start with 16, then try tweaking it up and down to see if it helps).
If, on the other hand, your code is mostly I/O-bound, then a couple dozen threads is reasonable, especially if the delays are unpredictable, but not 1000. And threads in the same process will work fine, because one thread can run the Python interpreter while the others are all waiting for the OS to finish a disk operation.
So, in this case, you want a concurrent.futures.ThreadPoolExecutor.
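The thread-pool version only differs in which executor you construct; a sketch, reusing the same assumed process_file wrapper and file list as above:
import concurrent.futures

# process_file and files are the same as in the process-pool sketch.
# A couple dozen threads is plenty for I/O-bound work.
with concurrent.futures.ThreadPoolExecutor(max_workers=24) as executor:
    for finished in executor.map(process_file, files):
        print(finished)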
If you're not sure, and don't know how to find out, build it with a thread pool first, then use Activity Monitor (or whatever Windows now calls its process manager, or your favorite of the 300 options on Linux) to watch it run; if you end up with one core at 100% and the others below 25%, then you're too CPU-bound to be using threads. Fortunately, switching to a process pool is a trivial change: replace ThreadPoolExecutor with ProcessPoolExecutor, remove the max_workers argument so Python will pick the best default, and you're done.
In either case, the examples in the docs are good enough that there's no reason to ask for other sample code.
Upvotes: 5
Reputation: 35109
If you decide to go with multiprocessing, you would do it in a very similar way.
You can try something like this:
import Queue
from threading import Thread

file_list = ['filea', 'fileb']

def do_stuff(q):
    # Worker: pull file names off the queue until it is empty
    while True:
        try:
            file_name = q.get(False)
        except Queue.Empty:
            # Handle empty queue here -- nothing left, so exit the thread
            break
        # do whatever you need here, e.g. process file_name
        print file_name
        q.task_done()

q = Queue.Queue(maxsize=0)
num_threads = 2

# Fill the queue with the files to process
for x in file_list:
    q.put(x)

# Start the worker threads
for i in range(num_threads):
    worker = Thread(target=do_stuff, args=(q,))
    worker.setDaemon(True)
    worker.start()

# Block until every queued file has been marked done
q.join()
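For comparison, a rough sketch of the multiprocessing variant mentioned above; the structure mirrors the threading version, swapping in a JoinableQueue and Process (this is an assumption about how you would adapt it, not tested against your data):
import multiprocessing
import Queue

file_list = ['filea', 'fileb']

def do_stuff(q):
    while True:
        try:
            # short timeout so a worker doesn't exit before the
            # queue's feeder thread has flushed the first items
            file_name = q.get(timeout=1)
        except Queue.Empty:
            break
        print file_name  # do your per-file work here
        q.task_done()

if __name__ == '__main__':
    q = multiprocessing.JoinableQueue()
    num_workers = 2
    for x in file_list:
        q.put(x)
    for i in range(num_workers):
        worker = multiprocessing.Process(target=do_stuff, args=(q,))
        worker.daemon = True
        worker.start()
    q.join()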
Upvotes: 1