Dimitar Baldzhiev

Reputation: 345

C#.net multithreading

I am experimenting with optimizing some mathematical operations using C#/.NET within a package called Grasshopper (part of Rhino3D). The operation is quite simple, but the list it has to be performed on is big and may get much bigger.

I am using Parallel.ForEach and lists in my C# script, and the number of results I get back is lower than expected. This is most probably because List<T>.Add is not thread-safe (or not thread-safe within the software I'm building on top of).

  private void RunScript(double z, int x, List<double> y, ref object A)
  {
    List<double> temp = new List<double>();
    double r;
    System.Threading.Tasks.Parallel.ForEach(y, numb =>
    {
      r = Math.Pow((numb * x), z);
      temp.Add(r);
    });
    A = temp;
  }

Please help me figure out a simple and efficient way of running this simple math operation over several hundred values using CPU multithreading (suggestions for GPU/CUDA are also welcome).

I hope the obscure and specific software does not bother you; as far as I know, it behaves identically to normal C#.NET/Python/VB.NET.

Upvotes: 15

Views: 1538

Answers (5)

Dimitar Baldzhiev

Reputation: 345

I was also looking at changing the input a little: splitting the data into separate branches, computing each branch on a separate thread, and then recombining them at the end. However, this scores the worst, at 531ms. I understand the script is bad, but I think it shows my idea well, and if written properly it might succeed. No?

  private void RunScript(double z, int x, List<double> y, DataTree<double> u, ref object A)
  {
    System.Threading.Tasks.Task<double[]> th1 = System.Threading.Tasks.Task<double[]>.Factory.StartNew(() => mP(u.Branch(0).ToArray(), x, z));
    System.Threading.Tasks.Task<double[]> th2 = System.Threading.Tasks.Task<double[]>.Factory.StartNew(() => mP(u.Branch(1).ToArray(), x, z));
    System.Threading.Tasks.Task<double[]> th3 = System.Threading.Tasks.Task<double[]>.Factory.StartNew(() => mP(u.Branch(2).ToArray(), x, z));
    System.Threading.Tasks.Task<double[]> th4 = System.Threading.Tasks.Task<double[]>.Factory.StartNew(() => mP(u.Branch(3).ToArray(), x, z));
    System.Threading.Tasks.Task<double[]> th5 = System.Threading.Tasks.Task<double[]>.Factory.StartNew(() => mP(u.Branch(4).ToArray(), x, z));
    System.Threading.Tasks.Task<double[]> th6 = System.Threading.Tasks.Task<double[]>.Factory.StartNew(() => mP(u.Branch(5).ToArray(), x, z));
    System.Threading.Tasks.Task<double[]> th7 = System.Threading.Tasks.Task<double[]>.Factory.StartNew(() => mP(u.Branch(6).ToArray(), x, z));
    System.Threading.Tasks.Task<double[]> th8 = System.Threading.Tasks.Task<double[]>.Factory.StartNew(() => mP(u.Branch(7).ToArray(), x, z));

    List<double> list = new List<double>();

    list.AddRange(th1.Result);
    list.AddRange(th2.Result);
    list.AddRange(th3.Result);
    list.AddRange(th4.Result);
    list.AddRange(th5.Result);
    list.AddRange(th6.Result);
    list.AddRange(th7.Result);
    list.AddRange(th8.Result);

    A = list;
  }

Sorry about the long fully qualified names; I cannot add anything to the "using" section.
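For illustration, the one-task-per-branch idea above could be written with a loop instead of eight hand-copied lines. This is only a sketch under assumptions: the `mP` helper is not shown in the post, so `PowerAll` below is a hypothetical stand-in for it, and a plain `IList<double[]>` stands in for the Grasshopper `DataTree<double>` branches. The names `BranchTasks` and `Compute` are also made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class BranchTasks
{
    // Hypothetical stand-in for the mP helper, which the post does not show.
    static double[] PowerAll(double[] values, int x, double z)
    {
        var output = new double[values.Length];
        for (int i = 0; i < values.Length; i++)
        {
            output[i] = Math.Pow(values[i] * x, z);
        }
        return output;
    }

    // Assumption: each branch has already been converted to a double[].
    public static List<double> Compute(IList<double[]> branches, int x, double z)
    {
        // Start one task per branch; each task returns its own result array.
        Task<double[]>[] tasks = branches
            .Select(branch => Task.Factory.StartNew<double[]>(() => PowerAll(branch, x, z)))
            .ToArray();

        // Block until every branch has finished.
        Task.WaitAll(tasks);

        // Recombine on the calling thread, in branch order.
        var list = new List<double>();
        foreach (Task<double[]> task in tasks)
        {
            list.AddRange(task.Result);
        }
        return list;
    }
}
```

Inside a script component where "using" cannot be edited, the same calls would need their fully qualified names. Each task only writes to its own array, so no locking is needed, but with only a few hundred values the cost of starting tasks may well outweigh the work itself.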

Upvotes: 0

Peter Duniho

Reputation: 70671

You surmise correctly: List<T> is not thread-safe. You must synchronize access to any instance of it that is shared between threads.

One option is to simply synchronize in each task:

private void RunScript(double z, int x, List<double> y, ref object A)
{
    List<double> temp = new List<double>();
    object l = new object();
    System.Threading.Tasks.Parallel.ForEach(y, numb =>
    {
      double r = Math.Pow((numb * x), z);
      lock (l) temp.Add(r);
    });
    A = temp;
}

Note: your code had another bug in it also. You were sharing the same r variable amongst all the tasks, which could lead to the same value being added two or more times to the result, while other values were left out. I fixed the bug by simply moving the variable declaration to the body of the anonymous method used for the ForEach() call.


Another option is to recognize that you know in advance how many results you will have, and so can simply initialize an array large enough to contain all the results:

private void RunScript(double z, int x, List<double> y, ref object A)
{
    double[] results = new double[y.Count];
    System.Threading.Tasks.Parallel.For(0, y.Count, i =>
    {
      // read-only access of `y` is thread-safe:
      results[i] = Math.Pow((y[i] * x), z);
    });
    A = new List<double>(results);
}

No two threads will ever try to access the same element in the results array, and the array itself will never change (i.e. be reallocated), so this is perfectly thread safe.

The above assumes that you really do need a List<double> as the output object. Of course, if an array is satisfactory, then you can just assign results to A instead of passing it to the List<T> constructor to create a whole new object at the end.
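As a possible variation on the array approach, the per-element delegate call can be replaced by one call per index range, which sometimes helps when the work per element is as small as a single Math.Pow. The sketch below assumes the standard `Partitioner.Create(int, int)` overload from System.Collections.Concurrent; the class and method names (`ChunkedPower`, `Compute`) are made up for the example, and whether this is actually faster inside Grasshopper would have to be profiled.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ChunkedPower
{
    // Computes Math.Pow(y[i] * x, z) for every element, one contiguous index range per work item.
    public static double[] Compute(IList<double> y, int x, double z)
    {
        double[] results = new double[y.Count];
        if (y.Count == 0)
        {
            return results; // Partitioner.Create rejects an empty range
        }

        // Split [0, y.Count) into ranges; each range is a Tuple<int, int>.
        var ranges = Partitioner.Create(0, y.Count);
        Parallel.ForEach(ranges, range =>
        {
            // Item1 is inclusive, Item2 is exclusive.
            for (int i = range.Item1; i < range.Item2; i++)
            {
                results[i] = Math.Pow(y[i] * x, z);
            }
        });
        return results;
    }
}
```

As with the Parallel.For version, every index is written by exactly one thread and `y` is only read, so no lock is required.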

Upvotes: 15

Dimitar Baldzhiev

Reputation: 345

Thanks very much for your input! In case you are interested, the profiler output is as follows:

Peter Duniho 1st option : 330ms

Peter Duniho 2nd option : 207ms

Dweeberly option: 335ms

Mattias Buelens option: 376ms

This is very strange. Supposedly .NET scripts should run faster in Grasshopper (because it is built on .NET), yet none of your solutions beats the Python parallel computation at 129ms!

Anyway, thanks to all of you for the detailed answers! You are great!

Upvotes: 0

Mattias Buelens

Reputation: 20159

A simpler solution would probably be to use .AsParallel() and work on the resulting ParallelEnumerable instead:

private void RunScript(double z, int x, List<double> y, ref object A)
{
    A = y
        .AsParallel().AsOrdered()
        .Select(elem => Math.Pow((elem * x), z))
        .ToList();
}
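Since the original problem was missing results, it may be worth checking a parallel version against a plain sequential pass. The sketch below wraps the PLINQ query in a method and compares it with sequential LINQ using `SequenceEqual`; the names `PlinqCheck`, `ParallelPower` and `MatchesSequential` are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PlinqCheck
{
    // Same query as in the answer, returned instead of assigned to A.
    public static List<double> ParallelPower(List<double> y, int x, double z)
    {
        return y
            .AsParallel().AsOrdered() // AsOrdered keeps the output in input order
            .Select(elem => Math.Pow(elem * x, z))
            .ToList();
    }

    // True when the parallel result has the same values, in the same order, as a sequential pass.
    public static bool MatchesSequential(List<double> y, int x, double z)
    {
        List<double> expected = y.Select(elem => Math.Pow(elem * x, z)).ToList();
        return expected.SequenceEqual(ParallelPower(y, x, z));
    }
}
```

If output order does not matter, the `AsOrdered()` call can be dropped, in which case the comparison would have to ignore order (for example by sorting both lists first).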

Upvotes: 7

Dweeberly

Reputation: 4777

Here is another option:

    private void RunScript(double z, int x, List<double> y, ref object A) {
        var temp = new System.Collections.Concurrent.BlockingCollection<double>();
        System.Threading.Tasks.Parallel.ForEach(y, numb => {
            double r = Math.Pow((numb * x), z);
            temp.Add(r);
        });
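        // Note: items added from parallel threads are not guaranteed to be in input order.
        A = temp; // if a list is needed: A = temp.ToList(); (requires System.Linq)
    }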

Peter did a good job of outlining the issues with your code, and I think the second function he suggests is probably your best option. Still, it's nice to see alternatives and to learn that the .NET Framework contains thread-safe concurrent collections.
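One caveat with concurrent collections filled from parallel threads is that the items do not come back in input order. If order matters, a rough sketch is to store each result together with its index and sort afterwards. Note that this swaps in `ConcurrentBag<T>` rather than `BlockingCollection<T>`, and the names `IndexedBag` and `Compute` are made up for the example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class IndexedBag
{
    public static List<double> Compute(List<double> y, int x, double z)
    {
        // Thread-safe, unordered collection of (index, value) pairs.
        var bag = new ConcurrentBag<KeyValuePair<int, double>>();
        Parallel.For(0, y.Count, i =>
        {
            bag.Add(new KeyValuePair<int, double>(i, Math.Pow(y[i] * x, z)));
        });

        // Restore input order by sorting on the recorded index.
        return bag
            .OrderBy(pair => pair.Key)
            .Select(pair => pair.Value)
            .ToList();
    }
}
```

The extra sort is work that the preallocated-array approach avoids entirely, which is one more reason that approach is likely the better fit here.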

Upvotes: 2
