lysergic-acid

Reputation: 20050

Potential benefit of async and Parallel.ForEach for IO operations

I am developing and maintaining a .NET 3.5 tool at work, and I am wondering whether it could gain performance from .NET 4's new TPL, or even from the new async features that are still in CTP.

The tool's work can be roughly described as:

  1. Retrieve a list of container files (currently .MSI files) -- a few dozen of them, ~ 50-70
  2. Iterate over each file, and construct a runtime object representing it.
  3. For each runtime object created, perform some queries on its contents (compare its contents with some files on the system).

Items #2 and #3 are the lengthy ones, and I would like some opinions on whether the execution time (a few minutes right now) could be improved by using Parallel.ForEach or other methods for executing this work in parallel.

Potential improvements I foresee are:

  1. Making use of multiple CPUs/cores.
  2. Keeping the app free to do other work while IO operations (like reading files) are in progress.

Would you think this kind of application can benefit from these, before jumping into development?

Upvotes: 0

Views: 1365

Answers (3)

Henk Holterman

Reputation: 273784

Would you think this kind of application can benefit from these, before jumping into development?

Not very much. You describe a 3-stage system in which every stage is heavily I/O bound.

I assume you have only one disk, which means running in parallel could even slow it down (more seek operations).

On the other hand, stages 2 and 3 could be CPU-intensive enough to see some improvement.

You will have to measure, as usual.
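
As a rough illustration of what "measure" could look like, here is a minimal sketch that times the same work sequentially and with Parallel.ForEach (so it needs .NET 4). `ProcessContainer` is a hypothetical stand-in for stages 2 and 3, and the class and method names are assumptions, not anything from the tool in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Threading.Tasks;

static class ParallelTiming
{
    // Hypothetical stand-in for stages 2 and 3: read the file, then do some work on its bytes.
    public static long ProcessContainer(string path)
    {
        byte[] bytes = File.ReadAllBytes(path);
        long checksum = 0;
        foreach (byte b in bytes)
        {
            checksum += b;
        }
        return checksum;
    }

    // Processes every file on the calling thread and returns the elapsed time.
    public static TimeSpan TimeSequential(IEnumerable<string> paths)
    {
        Stopwatch watch = Stopwatch.StartNew();
        foreach (string path in paths)
        {
            ProcessContainer(path);
        }
        watch.Stop();
        return watch.Elapsed;
    }

    // Processes the files with Parallel.ForEach, capped at maxDegree concurrent workers.
    public static TimeSpan TimeParallel(IEnumerable<string> paths, int maxDegree)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegree };
        Stopwatch watch = Stopwatch.StartNew();
        Parallel.ForEach(paths, options, path => { ProcessContainer(path); });
        watch.Stop();
        return watch.Elapsed;
    }
}
```

Keep in mind that the operating system caches file contents, so whichever variant runs second over the same files may look faster than it really is. Trying a few values of `maxDegree` (including 1) should show whether the disk or the CPU is the limit.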

Upvotes: 1

Adam Ralph

Reputation: 29956

I would run a profiler to see where your application is spending time and then decide. If you find it is waiting for I/O completion then you may find benefit from using the Asynchronous Programming Model. If you find you are compute bound, then, depending on your anticipated runtime environment (multi-core/single core), you may find multi-threaded computation to be of benefit. Of course, you may find that both cases apply.
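
For the I/O-waiting case, a minimal sketch of the Asynchronous Programming Model on a FileStream, which works on .NET 3.5, might look like the following. The `ApmRead` class and the `whileWaiting` callback are illustrative assumptions; the callback stands for whatever other work the tool could do while a read is in flight.

```csharp
using System;
using System.IO;

static class ApmRead
{
    // Reads a whole file using BeginRead/EndRead. whileWaiting (hypothetical) is
    // invoked after each read has been started and before its result is collected.
    public static byte[] ReadWithApm(string path, Action whileWaiting)
    {
        // FileOptions.Asynchronous asks the OS for overlapped (asynchronous) I/O.
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                           FileShare.Read, 4096, FileOptions.Asynchronous))
        {
            var buffer = new byte[stream.Length];
            int total = 0;
            while (total < buffer.Length)
            {
                IAsyncResult pending = stream.BeginRead(buffer, total, buffer.Length - total, null, null);

                // The read is in progress; do something else in the meantime.
                if (whileWaiting != null)
                {
                    whileWaiting();
                }

                // EndRead blocks until this read completes and may return fewer bytes than requested.
                int read = stream.EndRead(pending);
                if (read == 0)
                {
                    break; // unexpected end of file
                }
                total += read;
            }
            return buffer;
        }
    }
}
```

A call such as `ApmRead.ReadWithApm(path, () => DoOtherWork())` would overlap `DoOtherWork` (a hypothetical method) with the disk read. This only helps if there is useful work to overlap; it does not make the disk itself any faster.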

Incidentally, you can also use many of the .NET 4 threading features in .NET 3.5 by using Reactive Extensions. I am currently using this in a production .NET 3.5 application.

Upvotes: 3

Reed Copsey

Reputation: 564851

This may well see some improvement from the TPL, which is available now in .NET 4.

All three steps could potentially be designed to run in parallel.

That being said, it's difficult, given the above, to know how much improvement you would see. The main issue is the heavy file I/O. Even if you take advantage of multiple cores, the disk I/O will likely become a bottleneck, and trying to run this in parallel may actually slow down those portions of the code.

If you're doing a huge amount of IO in relation to the queries/computations, then you may not get a very large performance benefit just by running the routines in parallel.
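
One possible way to keep the disk from being hit by several readers at once while still using multiple cores for the queries is to read the files on a single thread and hand the contents to a few worker tasks. The sketch below assumes .NET 4; `Analyze` is a hypothetical placeholder for the CPU-bound part, and the queue capacity of 4 is an arbitrary choice.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

static class ReadThenCompute
{
    // Hypothetical CPU-bound stage standing in for the real queries.
    public static long Analyze(byte[] contents)
    {
        long sum = 0;
        foreach (byte b in contents)
        {
            sum += b;
        }
        return sum;
    }

    // Reads files one at a time on the calling thread; workerCount tasks analyze them.
    public static long[] Run(IList<string> paths, int workerCount)
    {
        var results = new long[paths.Count];

        // The bounded queue stops the reader from getting far ahead of the workers.
        using (var queue = new BlockingCollection<KeyValuePair<int, byte[]>>(boundedCapacity: 4))
        {
            Task[] workers = Enumerable.Range(0, workerCount)
                .Select(_ => Task.Factory.StartNew(() =>
                {
                    foreach (KeyValuePair<int, byte[]> item in queue.GetConsumingEnumerable())
                    {
                        // Each index is written by exactly one worker, so no lock is needed.
                        results[item.Key] = Analyze(item.Value);
                    }
                }, TaskCreationOptions.LongRunning))
                .ToArray();

            try
            {
                for (int i = 0; i < paths.Count; i++)
                {
                    queue.Add(new KeyValuePair<int, byte[]>(i, File.ReadAllBytes(paths[i])));
                }
            }
            finally
            {
                // Tell the workers no more items are coming, then wait for them to drain the queue.
                queue.CompleteAdding();
                Task.WaitAll(workers);
            }
        }
        return results;
    }
}
```

Whether this beats a plain Parallel.ForEach depends on how the time splits between reading and computing, so it is worth timing both.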

Upvotes: 4
