Reputation: 15139
I am trying to use Parallel.ForEach on a list and for each item in the list, trying to make a database call. I am trying to log each item with or without error. Just wanted to check with experts here If I am doing thinsg right way. For this example, I am simulating the I/O using the File access instead of database access.
static ConcurrentQueue<IdAndErrorMessage> queue = new ConcurrentQueue<IdAndErrorMessage>();
private static void RunParallelForEach()
{
List<int> list = Enumerable.Range(1, 5).ToList<int>();
Console.WriteLine("Start....");
Stopwatch stopWatch = new Stopwatch();
stopWatch.Start();
Parallel.ForEach(list, (tempId) =>
{
string errorMessage = string.Empty;
try
{
ComputeBoundOperationTest(tempId);
try
{
Task[] task = new Task[1]
{
Task.Factory.StartNew(() => this.contentFactory.ContentFileUpdate(content, fileId))
};
}
catch (Exception ex)
{
this.tableContentFileConversionInfoQueue.Enqueue(new ContentFileConversionInfo(fileId, ex.ToString()));
}
}
catch (Exception ex)
{
errorMessage = ex.ToString();
}
if (queue.SingleOrDefault((IdAndErrorMessageObj) => IdAndErrorMessageObj.Id == tempId) == null)
{
queue.Enqueue(new IdAndErrorMessage(tempId, errorMessage));
}
}
);
Console.WriteLine("Stop....");
Console.WriteLine("Total milliseconds :- " + stopWatch.ElapsedMilliseconds.ToString());
}
Below are the helper methods :-
private static byte[] FileAccess(int id)
{
if (id == 5)
{
throw new ApplicationException("This is some file access exception");
}
return File.ReadAllBytes(Directory.GetFiles(Environment.SystemDirectory).First());
//return File.ReadAllBytes("Files/" + fileName + ".docx");
}
private static void ComputeBoundOperationTest(int tempId)
{
//Console.WriteLine("Compute-bound operation started for :- " + tempId.ToString());
if (tempId == 4)
{
throw new ApplicationException("Error thrown for id = 4 from compute-bound operation");
}
Thread.Sleep(20);
}
private static void EnumerateQueue(ConcurrentQueue<IdAndErrorMessage> queue)
{
Console.WriteLine("Enumerating the queue items :- ");
foreach (var item in queue)
{
Console.WriteLine(item.Id.ToString() + (!string.IsNullOrWhiteSpace(item.ErrorMessage) ? item.ErrorMessage : "No error"));
}
}
Upvotes: 1
Views: 2639
Reputation: 2554
You might want to remove Console.WriteLine
from thread code. Reason being there can be only one console per windows app. So if two or more threads going to write parallel to console, one has to wait.
In replacement to your custom error queue you might want to see .NET 4's Aggregate Exception and catch that and process exceptions accordingly. The InnerExceptions
propery will give you the necessary list of exceptions. More here
And a general code review comment, don't use magic numbers like 4
in if (tempId == 4)
Instead have some const defined which tells what 4 stands for. e.g. if (tempId == Error.FileMissing)
Upvotes: 1
Reputation: 35881
Parallel.ForEach
runs an action/func concurrently up to a certain number of simultaneous instances. If what each of those iterations is doing is not inherently independent on one another, you're not getting any performance gains. And, likely are reducing performance by introducing expensive context switching and contention. You say that you want to do a "database call" and simulating it with a file operation. If each iteration uses the same resource (same row in a database table, for example; or try to write to the same file in the same location) then they're not really going to be run in parallel. only one will be running at a time, the others will simply be "waiting" to get a hold of the resource--needlessly making your code complex.
You haven't detailed what you want to do for each iteration; but when I've encountered situations like this with other programmers, they almost always aren't really doing things in parallel and they've simply gone through and replaced foreach
s with Parallel.ForEach
in the hopes of magically gaining performance or magically making use of multi-CPU/Core processors.
Upvotes: 0
Reputation: 564413
There is no reason to do this:
/*Below task is I/O bound - so do this Async.*/
Task[] task = new Task[1]
{
Task.Factory.StartNew(() => FileAccess(tempId))
};
Task.WaitAll(task);
By scheduling this in a separate task, and then immediately waiting on it, you're just tying up more threads. You're better off leaving this as:
/*Below task is I/O bound - but just call it.*/
FileAccess(tempId);
That being said, given that you're making a logged value (exception or success) for every item, you might want to consider writing this into a method and then just calling the entire thing as a PLINQ query.
For example, if you write this into a method that handles the try/catch (with no threading), and returns the "logged string", ie:
string ProcessItem(int id) { // ...
You could write the entire operation as:
var results = theIDs.AsParallel().Select(id => ProcessItem(id));
Upvotes: 2