Lemth

Reputation: 57

How to properly run heavy calculations in parallel using C#?

The Goal

The goal is to compute all possible polyform shapes made up of a certain number of squares. Since this is a very heavy computation for larger numbers, I wanted to make use of the multiple cores my computer has.

The Problem

I made the problem easier to explain and test by creating the following scenario:

1) for each of the values 2, 3, 5, and 7:
2) find all multiples (up to a certain limit) and add them to the same List
3) remove all duplicates from said List

In my final program, step 2 is much larger and more computationally heavy, so I would prefer to split step 2 into as many parallel tasks as there are values to check from step 1.

What I Tried

I made a WinForms app (C# on .NET Core) with 5 buttons, each trying a different variation of parallelism that I found here on Stack Overflow and elsewhere on the internet.

Here is the code (which looks like a lot, but it's just 5 variations of the same idea). Each handler reports a count, to check that all variants produce the same result, plus the time it took:

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Security.Permissions;
using System.Text;
using System.Threading.Tasks;
using System.Windows.Forms;

namespace Parallelism
{
    public partial class Form1 : Form
    {
        private readonly int Repeat = 10000000; 

        public Form1()
        {
            InitializeComponent();
        }

        // Variation 1: plain serial loop (baseline).
        private void button1_Click(object sender, EventArgs e)
        {
            var watch = System.Diagnostics.Stopwatch.StartNew();
            List<int> output = new List<int>();
            foreach (int x in new int[] { 2, 3, 5, 7 })
            {
                for (int i = 0; i < Repeat; i++)
                {
                    output.Add(x * i);
                }
            }
            output = output.Distinct().ToList();
            watch.Stop();
            (sender as Button).Text += $", c:{output.Count} - {watch.ElapsedMilliseconds}ms";
        }

        // Variation 2: four explicit Task.Run calls writing to a shared ConcurrentBag.
        private void button2_Click(object sender, EventArgs e)
        {
            var watch = System.Diagnostics.Stopwatch.StartNew();
            ConcurrentBag<int> output = new ConcurrentBag<int>();
            Task task = Task.WhenAll(
              Task.Run(() => button2_Calculation(2, output)),
              Task.Run(() => button2_Calculation(3, output)),
              Task.Run(() => button2_Calculation(5, output)),
              Task.Run(() => button2_Calculation(7, output))
            );
            task.Wait();
            HashSet<int> output2 = new HashSet<int>(output);
            watch.Stop();
            (sender as Button).Text += $", c:{output2.Count} - {watch.ElapsedMilliseconds}ms";
        }
        private void button2_Calculation(int x, ConcurrentBag<int> output)
        {
            for (int i = 0; i < Repeat; i++)
            {
                output.Add(x * i);
            }
        }

        // Variation 3: foreach over the source array with AsParallel().
        private void button3_Click(object sender, EventArgs e)
        {
            var watch = System.Diagnostics.Stopwatch.StartNew();
            List<int> output = new List<int>();
            foreach (int x in (new int[] { 2, 3, 5, 7 }).AsParallel())
            {
                for (int i = 0; i < Repeat; i++)
                {
                    output.Add(x * i);
                }
            }
            output = output.Distinct().ToList();
            watch.Stop();
            (sender as Button).Text += $", c:{output.Count} - {watch.ElapsedMilliseconds}ms";
        }

        // Variation 4: one Task.Factory.StartNew per value, each waited on.
        private void button4_Click(object sender, EventArgs e)
        {
            var watch = System.Diagnostics.Stopwatch.StartNew();
            ConcurrentBag<int> output = new ConcurrentBag<int>();
            Dictionary<int, Task> runningTasks = new Dictionary<int, Task>();
            foreach (int x in new int[] { 2, 3, 5, 7 })
            {
                int value = x;
                runningTasks.Add(x, Task.Factory.StartNew(() => button2_Calculation(value, output)));
            }
            foreach (Task t in runningTasks.Select(c => c.Value))
                t.Wait();
            HashSet<int> output2 = new HashSet<int>(output);
            watch.Stop();
            (sender as Button).Text += $", c:{output2.Count} - {watch.ElapsedMilliseconds}ms";
        }

        // Variation 5: Parallel.ForEach over the four values.
        private void button5_Click(object sender, EventArgs e)
        {
            var watch = System.Diagnostics.Stopwatch.StartNew();
            ConcurrentBag<int> output = new ConcurrentBag<int>();
            Parallel.ForEach(new int[] { 2, 3, 5, 7 }, x => button5_Calculation(x, output));
            HashSet<int> output2 = new HashSet<int>(output);
            watch.Stop();
            (sender as Button).Text += $", c:{output2.Count} - {watch.ElapsedMilliseconds}ms";
        }
        private void button5_Calculation(int x, ConcurrentBag<int> output)
        {
            for (int i = 0; i < Repeat; i++)
                output.Add(x * i);
        }
    }
}

Results So Far

So far, all the above methods result in a similar duration of 1s - 1.5s. In fact, sometimes the plain serial execution seems to be a lot faster. How is this possible? I would expect that with 8 cores (16 logical cores), splitting the tasks would result in faster overall execution.

Any help is very much appreciated!

The future

After learning more about how to properly implement parallelism, I expect to also run the entire calculation on another thread / async so the GUI remains responsive.
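Something like the standard async/await pattern is what I have in mind (a hypothetical sketch, not yet part of the test bench; HeavyCalculation is a placeholder for the real work):

// Hypothetical sketch: run the CPU-bound work on the thread pool and
// await it, so the UI message loop stays responsive in the meantime.
private async void buttonCompute_Click(object sender, EventArgs e)
{
    var watch = System.Diagnostics.Stopwatch.StartNew();
    // Task.Run moves the work off the UI thread; await hands control
    // back to the message loop until the work completes.
    List<int> output = await Task.Run(() => HeavyCalculation());
    watch.Stop();
    (sender as Button).Text += $", c:{output.Count} - {watch.ElapsedMilliseconds}ms";
}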

EDIT:

Response to @Pac0: here is my implementation of your suggestions. It doesn't seem to make much difference:

// Pac0's suggestion: each parallel iteration fills its own local
// HashSet; the per-iteration sets are merged into one at the end.
private void button6_Click(object sender, EventArgs e)
{
    var watch = System.Diagnostics.Stopwatch.StartNew();
    ConcurrentBag<HashSet<int>> bag = new ConcurrentBag<HashSet<int>>();
    Parallel.ForEach(new int[] { 2, 3, 5, 7 }, x =>
    {
        HashSet<int> temp = new HashSet<int>();
        for (int i = 0; i < Repeat; i++)
            temp.Add(x * i);
        bag.Add(temp);
    });
    HashSet<int> output2 = new HashSet<int>();
    foreach (var hash in bag)
        output2.UnionWith(hash);
    watch.Stop();
    (sender as Button).Text += $", c:{output2.Count} - {watch.ElapsedMilliseconds}ms";
}

Upvotes: 1

Views: 816

Answers (2)

yugami

Reputation: 379

As a comment mentioned, your use of a single shared collection is causing significant locking. Computationally, a task-based solution is about 50% faster (see below, where we don't manage a combined output). It's managing the collection that causes the contention; depending on how it's handled, it can be upwards of 3 times slower than serial execution.

The struggle with concurrency is always balancing the workload against the bottleneck.

using System;
using System.Collections.Generic;
using System.Threading.Tasks;

namespace ConsoleApp5
{
    class Program
    {
        static int Repeat = 100000000;
        static int[] worklist = new int[] { 2, 3, 5, 7 };

        static void Main(string[] args)
        {
            var watch = System.Diagnostics.Stopwatch.StartNew();

            Console.WriteLine("Hello World! Launching Threads");
            Task launcher = LaunchThreads();
            launcher.Wait();
            Console.WriteLine("Hello World! Threads Complete");

            watch.Stop();
            Console.WriteLine($"Threads took: {watch.ElapsedMilliseconds}");

            watch = System.Diagnostics.Stopwatch.StartNew();
            Console.WriteLine("Serial Execution Starting");
            foreach (int i in worklist)
            {
                DoWork(i);
            }
            watch.Stop();
            Console.WriteLine($"Serial Execution took: {watch.ElapsedMilliseconds}");
        }
        // Return Task instead of void so the caller can actually wait
        // for the worker tasks to finish before stopping the stopwatch.
        static async Task LaunchThreads()
        {

            var worktasks = new List<Task<List<int>>>();

            foreach (int i in worklist)
            {
                worktasks.Add(Task.Run(() => DoWork(i)));
            }

            await Task.WhenAll(worktasks);
        }
        static List<int> DoWork(int x)
        {
            Console.WriteLine($"Thread Worker: {x}");
            List<int> output = new List<int>();
            for (int i = 0; i < Repeat; i++)
            {
                output.Add(x * i);
            }

            Console.WriteLine($"Thread Worker: {x} - Exiting");
            return output;
        }
    }
}

Upvotes: 3

Lemth

Reputation: 57

I want to post this as an answer because someone named yugami posted an approach that was different from what I tried, and it was a useful and good response, but it was deleted.

So I am posting my efforts at recreating their code in my test bench:

// Recreation of yugami's approach: each task builds and returns its
// own local List<int>; the results are merged into a HashSet afterwards.
private async void button9_Click(object sender, EventArgs e)
{
    var watch = System.Diagnostics.Stopwatch.StartNew();
    HashSet<int> output = new HashSet<int>();
    var worktasks = new List<Task<List<int>>>();
    foreach (int i in new int[] { 2, 3, 5, 7 })
        worktasks.Add(Task.Run(() => button9_Calculation(i)));

    await Task.WhenAll(worktasks);
    foreach (Task<List<int>> tsk in worktasks)
        foreach (int i in tsk.Result)
            output.Add(i);
    watch.Stop();
    (sender as Button).Text += $", c:{output.Count} - {watch.ElapsedMilliseconds}ms";
}

private List<int> button9_Calculation(int x)
{
    List<int> output = new List<int>();
    for (int i = 0; i < Repeat; i++)
        output.Add(x * i);

    return output;
}

Here are the results of the serial version and the best two solutions with 100,000,000 tries. Here I finally see some improvement from doing step 2 in parallel, but now the biggest bottleneck is removing the duplicates / filtering it all down to a single HashSet...

So I think this solves my initial question about improving step 2. Now I will continue my search for ways to improve step 3: removing the duplicates.
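One direction I want to try (a sketch only, nothing here is benchmarked; it reuses button9_Calculation from above): PLINQ's Distinct() partitions the elements by hash across cores, so the de-duplication itself can run in parallel instead of being a single-threaded merge into one HashSet.

private async void button10_Click(object sender, EventArgs e)
{
    var watch = System.Diagnostics.Stopwatch.StartNew();
    var worktasks = new List<Task<List<int>>>();
    foreach (int x in new int[] { 2, 3, 5, 7 })
        worktasks.Add(Task.Run(() => button9_Calculation(x)));
    await Task.WhenAll(worktasks);

    // Flatten the four result lists and de-duplicate in parallel:
    // AsParallel() hash-partitions the elements, so each core removes
    // the duplicates that fall into its own partition.
    List<int> output = worktasks
        .SelectMany(t => t.Result)
        .AsParallel()
        .Distinct()
        .ToList();

    watch.Stop();
    (sender as Button).Text += $", c:{output.Count} - {watch.ElapsedMilliseconds}ms";
}

Whether this actually beats the serial HashSet merge is something I still have to measure.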

Upvotes: 0
