Reputation: 83
Currently our application processes a large number of files, over 1000 XML files in the same directory. The files are all read, parsed, and updated/saved to the database.
When we tested our application on a 12-core machine, the total process was much slower than on a 4-core machine.
What we observed is that the thread count produced by our application climbs to somewhere between 30 and 90 threads, and the number of context switches increases massively. This is probably caused by the large amount of parallel work being spawned, but all of it is important.
Is the context switching the culprit? Or the parallel reading/writing of files? Or should we reduce the number of parallel tasks?
Upvotes: 0
Views: 67
Reputation: 700182
The bottleneck here is the disk access. No matter how many threads you start, the file system can only read one file at a time. Starting more threads will only make them fight over this single resource, increasing both the context switching and the disk seek times.
At the other end of the process there is also a limitation, as only one thread at a time can update a table in the database, but the database is designed to handle multiple concurrent clients.
Make a single thread responsible for the disk reads, and once a file has been read, hand it off to a thread that processes it. That way you read from the disk in the most efficient way, and you keep the multi-threaded part of the operation behind the bottleneck.
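A minimal sketch of that pattern in Java (the question doesn't name a language, so the directory path, pool size, and the parseAndSave step are assumptions): the main thread reads files sequentially from the directory, and a small fixed-size worker pool does the parsing and database writes in parallel.

```java
import java.nio.file.*;
import java.util.concurrent.*;
import java.util.stream.Stream;

public class SingleReaderPipeline {

    public static void main(String[] args) throws Exception {
        Path dir = Paths.get("/path/to/xml");   // hypothetical input directory

        // Small pool for the CPU/DB-bound work, sized to the core count rather than the file count.
        ExecutorService workers = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors());

        // Single reader: the disk is read sequentially, one file at a time.
        try (Stream<Path> files = Files.list(dir)) {
            files.filter(p -> p.toString().endsWith(".xml"))
                 .forEach(p -> {
                     try {
                         byte[] content = Files.readAllBytes(p);          // sequential disk read
                         workers.submit(() -> parseAndSave(p, content));  // parallel parse + DB write
                     } catch (Exception e) {
                         e.printStackTrace();
                     }
                 });
        }

        workers.shutdown();
        workers.awaitTermination(1, TimeUnit.HOURS);
    }

    // Placeholder for the XML parsing and database update described in the question.
    static void parseAndSave(Path file, byte[] content) {
        // parse the XML and update/save the result to the database
    }
}
```

Sizing the worker pool to the number of cores, instead of spawning a thread per file, keeps the thread count bounded and with it the context switching.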
Upvotes: 1