Ram
Ram

Reputation: 67

Parallel not faster than sequential?

Trying to come up with strategies for finding next prime:

Algo #1 (Parallel):

    private static int NextPrime(int p)
    {
        int nextP = 2 * p; // there is always a prime between n & 2*n !
        Enumerable.Range(p + 1, p)
                  .AsParallel()
                  .ForAll(g =>
                  {
                      bool prime = true;
                      for (int i = 2; i <= Math.Sqrt(g); i++)
                      {
                          if (g % i == 0)
                          {
                              prime = false;
                              break;
                          }
                      }
                      if (prime)
                      {
                          if (g < nextP)
                              nextP = g;
                          return;
                      }
                  });

        return nextP;
    }

Algo #2 (Sequential):

    private static int NextPrimeNonParallel(int p)
    {
        int nextP = 2 * p; // there is always a prime between n & 2*n !
        foreach (var g in Enumerable.Range(p + 1, p))
        {
            bool prime = true;
            for (int i = 2; i <= Math.Sqrt(g); i++)
            {
                if (g % i == 0)
                {
                    prime = false;
                    break;
                }
            }
            if (prime)
            {
                if (g < nextP)
                    nextP = g;
                return nextP;
            }
        }
        return 0;
    }

Sequential is significantly faster than parallel :) Why? Or is my parallel algo really well written?

Thanks!

Upvotes: 2

Views: 1060

Answers (3)

N&#246;rden
N&#246;rden

Reputation: 155

2 reasons why a parallel program is slower than a sequential one.

1: Because of communication overhead,i.e., the time for information exchange between processors or threads or synchronization, and idle times(The time when a processor can not do anything useful but wait for a event to happen), it can also depend on that the distribution of work to each processor is not equally diveded.

  1. According to Amdahls low, if a part of the program cannot be paralleized then no matter of increasing processor numbers, because the minimum execution time can not be less than the sequential fraction time of execution.

Upvotes: 0

Ali Ferhat
Ali Ferhat

Reputation: 2579

@Joachim's answer is correct. I only want to add a couple of points:

  • Your primality testing algorithm is naive. If you are dealing with really large numbers and you want a really fast algorithm, there are complicated but asymptotically faster algorithms, see http://en.wikipedia.org/wiki/AKS_primality_test for example.

  • If you don't want that much complexity, but want to speed up your algorithm in a simple way:

    1. When checking the primality of M, divide it only to the prime numbers in 2..sqrt(M)
    2. So first find all the prime numbers in 2..sqrt(M).
    3. Memorize the list in 2nd step to speed up future queries.

    Suppose you want to find the smallest prime larger than 10^12. My suggestion assumes you have enough space to store all primes smaller than 10^6. It also assumes you will run many similar queries to make the finding all primes smaller than 10^6 step worthwhile.

Upvotes: 0

Joachim Isaksson
Joachim Isaksson

Reputation: 180987

The reason is that the parallel ForAll won't terminate when it finds the first prime but will always loop though the whole range before returning, while your non parallel version will return the first value right away when its found.

This will also make the parallel version buggy, since it won't return the first found prime in the range, but actually the last found one since it overwrites nextP every time a new prime is found in the range. Note that that may not be the highest prime in the range, just the last found. Since you're running tasks, elements may be handled out of order.

What you probably want to do instead is to use this version of Parallel.ForEach that gives you a ParallelLoopState with every iteration. If you call Break() (note: not Stop() due to out of order execution you may get a too large value then) on that object, the loop will break at earliest convenience. You will also need to make writing to nextP synchronized (it's shared state between all threads) so that it only saves the smallest value, a'la;

lock(lockObject) {
  if(value<nextP) nextP = value;
}

That, all in all, should allow you to run in parallel with greater performance.

Upvotes: 4

Related Questions