Reputation: 3837
I am looking for an example on how to use Parallel.For in C# with a reference type. I have been through the MSDN documentation, and all that I can find are examples that use a value type for thread local storage. The code that I'm trying is as follows:
public string[] BuildStrings(IEnumerable<string> str1, IEnumerable<string> str2, IEnumerable<string> str3)
{
// This method aggregates the strings in each of the collections and returns the combined set of strings. For example:
// str1 = "A1", "B1", "C1"
// str2 = "A2", "B2", "C2"
// str3 = "A3", "B3", "C3"
//
// Should return:
// "A1 A2 A3"
// "B1 B2 B3"
// "C1 C2 C3"
//
// The idea behind this code is to use a Parallel.For along with a thread local storage StringBuilder object per thread.
// Don't need any final method to execute after each partition has completed.
// No example on how to do this that I can find.
int StrCount = str1.Count(); // str1, str2, and str3 guaranteed to be equal in size and > 0.
var RetStr = new string[StrCount];
Parallel.For<StringBuilder>(0, StrCount, () => new StringBuilder(200), (i, j, sb1) =>
{
sb1.Clear();
sb1.Append(str1.ElementAt(i)).Append(' ').Append(str2.ElementAt(i)).Append(' ').Append(str3.ElementAt(i));
RetStr[i] = sb1.ToString();
}, (x) => 0);
return RetStr;
}
This code will not compile on Visual Studio 2013 Express edition. The error is on the Parallel.For line, right after the "(200),":
"Not all code paths return a value in lambda expression of type 'System.Func< int,System.Threading.Tasks.ParallelLoopState,System.Text.StringBuilder,System.Text.StringBuilder>'"
The test code looks like this:
static void Main(string[] args)
{
int Loop;
const int ArrSize = 50000;
// Declare the lists to hold the first, middle, and last names of the clients.
List<string> List1 = new List<string>(ArrSize);
List<string> List2 = new List<string>(ArrSize);
List<string> List3 = new List<string>(ArrSize);
// Init the data.
for (Loop = 0; Loop < ArrSize; Loop++)
{
List1.Add((Loop + 10000000).ToString());
List2.Add((Loop + 10100000).ToString());
List3.Add((Loop + 1100000).ToString());
}
IEnumerable<string> FN = List1;
IEnumerable<string> MN = List2;
IEnumerable<string> LN = List3;
//
// Time running the Parallel.For version.
//
Stopwatch SW = new Stopwatch();
SW.Start();
string[] RetStrings;
RetStrings = BuildMatchArrayOld(FN, MN, LN);
// Get the elapsed time as a TimeSpan value.
SW.Stop();
TimeSpan TS = SW.Elapsed;
// Format and display the TimeSpan value.
string ElapsedTime = TS.TotalSeconds.ToString();
Console.WriteLine("Old RunTime = " + ElapsedTime);
}
I found another somewhat similar question here that also does not compile. However, the accepted answer there, which uses a simpler form of the function, does not help me. I could do that for this particular case, but I would really like to know how to use thread local storage with a reference type in the future. Is this an MS bug, or am I missing the proper syntax?
EDIT
I did try this code from this link:
static void Main()
{
int[] nums = Enumerable.Range(0, 1000000).ToArray();
long total = 0;
// Use type parameter to make subtotal a long, not an int
Parallel.For<long>(0, nums.Length, () => 0, (j, loop, subtotal) =>
{
subtotal += nums[j];
return subtotal;
},
(x) => Interlocked.Add(ref total, x)
);
Console.WriteLine("The total is {0:N0}", total);
Console.WriteLine("Press any key to exit");
Console.ReadKey();
}
It seems to work fine.
The problem is that when I try to use Parallel.For in my code and specify a return value, it gives other errors:
sb1.Append(str1.ElementAt(i)).Append(' ').Append(str2.ElementAt(i)).Append(' ').Append(str3.ElementAt(i));
This line now generates errors:
Error 'System.Collections.Generic.IEnumerable' does not contain a definition for 'ElementAt' and the best extension method overload 'System.Linq.Enumerable.ElementAt(System.Collections.Generic.IEnumerable, int)' has some invalid arguments
So, I have no clue what the problem is.
Upvotes: 0
Views: 1100
Reputation: 10875
You can also avoid the whole problem by using LINQ and AsParallel() instead of doing explicit parallelism.
int StrCount = str1.Count(); // str1, str2, and str3 guaranteed to be equal in size and > 0.
// AsParallel() has to come before the projection so that the string building itself runs in parallel;
// AsOrdered() keeps the results in index order.
var RetStr = from i in Enumerable.Range(0, StrCount).AsParallel().AsOrdered()
let sb1 = new StringBuilder(200)
select sb1.Append(str1.ElementAt(i)).Append(' ').Append(str2.ElementAt(i)).Append(' ').Append(str3.ElementAt(i)).ToString();
return RetStr.ToArray();
This may not be quite as fast, but it's probably a lot simpler.
Upvotes: 1
Reputation: 10875
Here's what your own For overloads would look like:
public static ParallelLoopResult For<TLocal>(int fromInclusive, int toExclusive, Func<TLocal> localInit, Func<int, ParallelLoopState, TLocal, TLocal> body)
{
return Parallel.For(fromInclusive, toExclusive, localInit, body, localFinally: _ => { });
}
static void StringBuilderFor(int count, Action<int, ParallelLoopState, StringBuilder> body)
{
Func<int, ParallelLoopState, StringBuilder, StringBuilder> b = (i, j, sb1) => { body(i, j, sb1); return sb1; };
For(0, count, () => new StringBuilder(200), b);
}
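As a usage sketch (not part of the original answer), the two helpers above could be wrapped in a class and called like this. The class name ParallelHelpers, the BuildStrings wrapper, and the ToList() copies are assumptions added for illustration; the copies are there so each iteration can index directly instead of calling ElementAt.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

public static class ParallelHelpers // hypothetical container class for the helpers above
{
    // Four-parameter overload: forwards to Parallel.For with a no-op localFinally.
    public static ParallelLoopResult For<TLocal>(int fromInclusive, int toExclusive,
        Func<TLocal> localInit, Func<int, ParallelLoopState, TLocal, TLocal> body)
    {
        return Parallel.For(fromInclusive, toExclusive, localInit, body, localFinally: _ => { });
    }

    // Lets the caller write a void body; the wrapper hands the same StringBuilder back each time.
    public static void StringBuilderFor(int count, Action<int, ParallelLoopState, StringBuilder> body)
    {
        For<StringBuilder>(0, count, () => new StringBuilder(200),
            (i, state, sb) => { body(i, state, sb); return sb; });
    }

    public static string[] BuildStrings(IEnumerable<string> str1, IEnumerable<string> str2, IEnumerable<string> str3)
    {
        // Assumption: copy the inputs to lists so indexing is cheap and the data is only read during the loop.
        List<string> a = str1.ToList(), b = str2.ToList(), c = str3.ToList();
        var result = new string[a.Count];
        StringBuilderFor(a.Count, (i, state, sb) =>
        {
            sb.Clear();
            sb.Append(a[i]).Append(' ').Append(b[i]).Append(' ').Append(c[i]);
            result[i] = sb.ToString(); // each index is written by exactly one iteration
        });
        return result;
    }
}
```

With this in place the loop body no longer needs a return statement or a fifth argument, which is the point of the wrapper.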
Upvotes: 1
Reputation: 3837
It turns out that the problem with getting the code to compile correctly is a syntax problem. It really would have helped if there had been an example published by Microsoft for this case. The following code will build and run correctly:
public string[] BuildStrings(IEnumerable<string> str1, IEnumerable<string> str2, IEnumerable<string> str3)
{
// This method aggregates the strings in each of the collections and returns the combined set of strings. For example:
// str1 = "A1", "B1", "C1"
// str2 = "A2", "B2", "C2"
// str3 = "A3", "B3", "C3"
//
// Should return:
// "A1 A2 A3"
// "B1 B2 B3"
// "C1 C2 C3"
//
// The idea behind this code is to use a Parallel.For along with a thread local storage StringBuilder object per thread.
// Don't need any final method to execute after each partition has completed.
// No example on how to do this that I can find.
int StrCount = str1.Count(); // str1, str2, and str3 guaranteed to be equal in size and > 0.
var RetStr = new string[StrCount];
Parallel.For<StringBuilder>(0, StrCount, () => new StringBuilder(200), (i, j, sb1) =>
{
sb1.Clear();
sb1.Append(str1.ElementAt(i)).Append(' ').Append(str2.ElementAt(i)).Append(' ').Append(str3.ElementAt(i));
RetStr[i] = sb1.ToString();
return sb1; // Problem #1 solved. Signature of function requires return value.
}, (x) => x = null); // Problem #2 solved. Replaces (x) => 0 above.
return RetStr;
}
So, the first problem, as was pointed out in the comments by Jon Skeet, was that my lambda method failed to return a value. Since I'm not using a return value, I did not put one in - at least initially. When I put in the return statement, the compiler then showed another error with the "ElementAt" extension method - as shown above under EDIT.
It turns out that the "ElementAt" error the compiler flagged as being the problem had nothing at all to do with the issue. This tends to remind me of my C++ days when the compiler was not nearly as helpful as the C# compiler. Identifying the wrong line as an error is quite rare in C# - but as can be seen from this example, it does happen.
The second problem was the line (x) => 0. This is the 5th parameter of the function, and it is called by each thread after all its work has been completed. I initially tried changing it to (x) => x.Clear. This generated the error message:
Only assignment, call, increment, decrement, await, and new object expressions can be used as a statement
The "ElementAt" errors were still present as well. From this clue I decided that (x) => 0 might be the real cause, even though no error message pointed at it. The 5th parameter is an Action<StringBuilder>, which returns nothing, so a lambda whose body is the bare value 0 cannot be converted to it. Since the work is complete at this point, I changed it to set the StringBuilder object to null since it would not be needed again. Magically, all of the "ElementAt" errors vanished. It built and ran correctly after that.
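To make the shapes of the three delegates explicit, here is a minimal sketch that declares each one with its type before passing it to Parallel.For. The class name DelegateShapes, the method Join, and its two-array signature are hypothetical and chosen only to keep the example short; it assumes both arrays have the same length.

```csharp
using System;
using System.Text;
using System.Threading.Tasks;

public static class DelegateShapes // hypothetical name, for illustration only
{
    public static string[] Join(string[] left, string[] right)
    {
        var result = new string[left.Length];

        // Runs once per worker task to create that task's private StringBuilder.
        Func<StringBuilder> localInit = () => new StringBuilder(200);

        // Runs once per index; it must hand the local back so the next iteration receives it.
        Func<int, ParallelLoopState, StringBuilder, StringBuilder> body = (i, state, sb) =>
        {
            sb.Clear();
            result[i] = sb.Append(left[i]).Append(' ').Append(right[i]).ToString();
            return sb;
        };

        // Runs once per worker task when it finishes. It returns nothing,
        // so a lambda such as "x => 0" cannot be converted to it.
        Action<StringBuilder> localFinally = _ => { };

        Parallel.For(0, left.Length, localInit, body, localFinally);
        return result;
    }
}
```

An empty block like _ => { } is one way to say "nothing to clean up" without assigning to the parameter.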
Parallel.For provides some nice functionality, but I think Microsoft would be well advised to revisit some of the functionality. Any time a line causes a problem, it should be flagged as such. That at least needs to be addressed.
It would also be nice if Microsoft could provide some additional overloads for Parallel.For that would allow void to be returned and would accept a null value for the 5th parameter. I actually tried passing null for that, and it built. But a run-time exception occurred because of this. A better idea is to provide an overload with 4 parameters for when no "thread completion" method needs to be called.
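For reference, the run-time failure with a null 5th argument can be reproduced with a small sketch like the one below. The class and method names are hypothetical; the exception type caught here, ArgumentNullException, is what the Parallel.For documentation lists for a null localFinally.

```csharp
using System;
using System.Text;
using System.Threading.Tasks;

public static class NullFinallyDemo // hypothetical name, for illustration only
{
    // Returns true if Parallel.For rejects a null localFinally argument.
    public static bool ThrowsOnNullFinally()
    {
        try
        {
            Parallel.For<StringBuilder>(0, 10,
                () => new StringBuilder(),
                (i, state, sb) => sb,
                (Action<StringBuilder>)null); // compiles, but is rejected at run time
            return false;
        }
        catch (ArgumentNullException)
        {
            return true;
        }
    }
}
```

Passing a do-nothing delegate such as _ => { } avoids the exception while still meaning "no completion step".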
Upvotes: 1