Reputation: 189
Let's assume we have the following (very abstract) task:
We have a folder with a varying number of files to process (there may be 1, 2, or a few thousand files). Each file can only be processed sequentially (i.e. it is impossible to read a whole file into memory and process it in several threads). The result of processing a file should be a new file, also written sequentially. How can we do this using all available CPU cores?
I see only two approaches:
Use a task queue processed by several threads, where each task processes a single file: read a chunk from the file, process the chunk, write the chunk to the result file.
Use something like the Pipeline pattern. One input thread reads files asynchronously and posts chunks to several processing queues. Each processing thread reads from its own queue and processes chunks, then posts the results to an output queue. An output thread writes the result files. So we have one input (reading) thread, one output (writing) thread, and several processing threads.
Chunk processing is not a very fast operation; it is slower than reading.
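The second (pipeline) approach can be sketched roughly as follows. This is only a minimal illustration, not a tuned implementation: `process_chunk` is a hypothetical stand-in for the real computation, the worker count and queue size are arbitrary, and in-memory chunk lists stand in for real files so the sketch stays self-contained. Note that the writer keys results by chunk index so the sequential output order survives out-of-order processing.

```python
import queue
import threading

NUM_WORKERS = 4  # assumption: roughly one per CPU core

def process_chunk(chunk):
    # stand-in for the real (slow) per-chunk computation
    return chunk.upper()

def run_pipeline(files):
    """files: dict of name -> list of chunks; returns the processed chunks."""
    work_q = queue.Queue(maxsize=64)  # bounded, so the reader cannot run far ahead
    out_q = queue.Queue()
    results = {name: [None] * len(chunks) for name, chunks in files.items()}

    def reader():
        for name, chunks in files.items():
            for i, chunk in enumerate(chunks):
                work_q.put((name, i, chunk))
        for _ in range(NUM_WORKERS):
            work_q.put(None)          # one shutdown sentinel per worker

    def worker():
        while True:
            item = work_q.get()
            if item is None:
                out_q.put(None)       # tell the writer this worker is done
                return
            name, i, chunk = item
            out_q.put((name, i, process_chunk(chunk)))

    def writer():
        finished = 0
        while finished < NUM_WORKERS:
            item = out_q.get()
            if item is None:
                finished += 1
                continue
            name, i, chunk = item
            results[name][i] = chunk  # index restores sequential output order

    threads = [threading.Thread(target=reader), threading.Thread(target=writer)]
    threads += [threading.Thread(target=worker) for _ in range(NUM_WORKERS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```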
OS: Mac/Linux, maybe Windows.
Which approach is better? Do we have any other solutions?
Upvotes: 0
Views: 664
Reputation: 12364
There are certain advantages and disadvantages to both approaches: a single reader thread versus doing the read in each processing thread.
BTW, there are more possible processing schemes. One you forgot to mention is a single writer thread: your processing threads dump their results into a queue and let a background thread write them. This might give you an additional boost, since no processing thread has to wait for writes.
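The single-writer idea can be sketched like this (a toy example: an in-memory list stands in for the output files, and the chunk data is made up). Because only the writer thread touches the output, the processing threads never block on I/O and no lock is needed around the writes:

```python
import queue
import threading

write_q = queue.Queue()
STOP = object()                      # sentinel to shut the writer down

output = []                          # stands in for the output files

def writer():
    # Only this thread writes, so no locking is needed on `output`.
    while True:
        item = write_q.get()
        if item is STOP:
            return
        output.append(item)          # real code: write to the right result file

def worker(chunks):
    for chunk in chunks:
        write_q.put(chunk.upper())   # hand off the result instead of writing it

w = threading.Thread(target=writer)
w.start()
workers = [threading.Thread(target=worker, args=(["a", "b"],)) for _ in range(2)]
for t in workers:
    t.start()
for t in workers:
    t.join()
write_q.put(STOP)
w.join()
```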
You can also use parallel readers that write into one queue, from which the processing threads take their work (even more complicated programming :-), but it works in some cases.
Well, parallel writers could also work.
Also, you can think about distributing your files across different local disks (not directories, but physical disks). This would definitely increase your read/write performance if the disks are used in parallel.
Upvotes: 0
Reputation: 14289
The best approach would be to write a simple Task class that performs the whole operation (read, process, write) stand-alone, without any ties to external, thread-unsafe state. Then use a task queue from which a fixed number of threads fetch those tasks and process them. A good number of threads is usually cores * 2.
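A minimal sketch of this task-per-file scheme, assuming a made-up `process_file` transformation (uppercasing) and a hypothetical `.out` naming convention. In CPython, note that threads only run the processing in parallel if it releases the GIL (I/O or native code); for pure-Python CPU work you would swap `ThreadPoolExecutor` for `ProcessPoolExecutor` with the same interface:

```python
import os
from concurrent.futures import ThreadPoolExecutor

def process_file(src):
    """Self-contained task: read, process, write, with no shared state."""
    dst = src + ".out"                # assumption: output naming convention
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:              # sequential read ...
            fout.write(line.upper())  # ... process, sequential write
    return dst

def process_folder(paths, workers=(os.cpu_count() or 1) * 2):
    # Fixed pool of threads pulling file-tasks from an internal queue.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_file, paths))
```

Because each task owns its input and output file completely, no synchronization is needed between tasks.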
It can be mathematically proven that option 2 will always be equal to or slower than the task-based solution, and in all cases it will be more complicated. The only situation where option 2 is more viable is when thread switching becomes the actual bottleneck, e.g. a server with 1000 concurrent but stateful connections and only one network card: there it is more efficient to have one network thread feeding the 1000 processing threads than to wake up 1000 threads on every byte sent over the line.
The task-based solution also makes it much easier to measure throughput and to compare how additional threads affect it, as you can simply measure in tasks per second.
Upvotes: 1
Reputation: 179981
Probably the simplest efficient solution is to have a single reader thread running at lower-than-default priority. Whenever a CPU core is free, it gets to run, and it spawns a worker thread (which processes one input file and writes the result back). Since the workers run at default priority, the scheme self-balances: when all CPUs are busy processing files, the reader thread gets little CPU time, so not many new worker threads are created.
There's no real point in separating the processing of files from writing them back to disk; that just creates the possibility of a lot of unwritten work queuing up in memory.
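Thread priorities are not portable (and not exposed by Python's `threading` module), so the following sketch approximates the self-balancing behaviour with a counting semaphore instead: the reader blocks whenever as many workers as cores are already busy. In-memory strings stand in for real files, and `handle` is a made-up whole-file transformation:

```python
import os
import threading

slots = threading.Semaphore(os.cpu_count() or 1)  # one slot per CPU core

results = []
lock = threading.Lock()              # results list is shared by the workers

def handle(name, text):
    try:
        out = text.upper()           # process one whole file ...
        with lock:
            results.append((name, out))  # ... and "write it back"
    finally:
        slots.release()              # free the slot so the reader may continue

def reader(files):
    threads = []
    for name, text in files:
        slots.acquire()              # blocks while all cores are busy
        t = threading.Thread(target=handle, args=(name, text))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()

reader([("a", "x"), ("b", "y")])
```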
Upvotes: 1