JTN

Reputation: 239

How to optimize the nested loops

for( a=1; a <= 25; a++){
  num1 = m[a];
  for( b=1; b <= 25; b++){
    num2 = m[b];
    for( c=1; c <= 25; c++){
      num3 = m[c];
      for( d=1; d <= 25; d++){
        num4 = m[d];
        for( e=1; e <= 25; e++){
          num5 = m[e];
          for( f=1; f <= 25; f++){
            num6 = m[f];
            for( g=1; g <= 25; g++){
              num7 = m[g];
              for( h=1; h <= 25; h++){
                num8 = m[h];
                for( i=1; i <= 25; i++){
                  num = num1*100000000 + num2*10000000 +
                        num3*  1000000 + num4*  100000 +
                        num5*    10000 + num6*    1000 +
                        num7*      100 + num8*      10 + m[i];
                  check_prime = 1;

                  for ( y=2; y <= num/2; y++)
                  {
                    if ( num % y == 0 )
                      check_prime = 0;
                  }

                  if ( check_prime != 0 )
                  {  
                    array[x++] = num;  
                  }
                  num = 0;  
                }}}}}}}}}

The above code takes a very long time to finish executing; in fact, it doesn't finish at all. What can I do to optimize the loops and speed up the execution? I am a newbie to C++.

Upvotes: 0

Views: 2930

Answers (3)

Will Ness

Reputation: 71119

We can eliminate a lot of redundant calculations by calculating each part of the number as it becomes available. This also shows the trial division test for primality on 2-3 wheel up to the square root of a number:

// array m[] is assumed sorted in descending order      NB!
// a macro to skip over the duplicate digits
#define I(x) while( x<25 && m[x+1]==m[x] ) ++x;
for( a=1; a <= 25; a++) {
 num1 = m[a]*100000000; 
 for( b=1; b <= 25; b++) if (b != a) { 
  num2 = num1 + m[b]*10000000;
  for( c=1; c <= 25; c++) if (c != b && c != a) {
   num3 = num2 + m[c]*1000000; 
   for( d=1; d <= 25; d++) if (d!=c && d!=b && d!=a) {
    num4 = num3 + m[d]*100000; 
    for( e=1; e <= 25; e++) if (e!=d && e!=c && e!=b && e!=a) {
     num5 = num4 + m[e]*10000;
     for( f=1; f <= 25; f++) if (f!=e&&f!=d&&f!=c&&f!=b&&f!=a) {
      num6 = num5 + m[f]*1000;
      limit = floor( sqrt( num6+1000 ));   ///
      for( g=1; g <= 25; g++) if (g!=f&&g!=e&&g!=d&&g!=c&&g!=b&&g!=a) {
       num7 = num6 + m[g]*100; 
       for( h=1; h <= 25; h++) if (h!=g&&h!=f&&h!=e&&h!=d&&h!=c&&h!=b&&h!=a) { 
        num8 = num7 + m[h]*10; 
        for( i=1; i <= 25; i++) if (i!=h&&i!=g&&i!=f&&i!=e&&i!=d
                                                          &&i!=c&&i!=b&&i!=a) {
          num = num8 + m[i];
          if( num % 2 != 0 && num % 3 != 0 ) {
            is_prime = 1;
            for ( y=5; y <= limit; y+=6) {
              if ( num %  y    == 0 ) { is_prime = 0; break; }
              if ( num % (y+2) == 0 ) { is_prime = 0; break; }
            }
            if ( is_prime ) { return( num ); } // largest prime found
          }I(i)}I(h)}I(g)}I(f)}I(e)}I(d)}I(c)}I(b)}I(a)}

This code also eliminates the duplicate indices. As you've indicated in the comments, you pick your numbers out of a 5x5 grid. That means that you must use all different indices. This will bring down the count of numbers to test from 25^9 = 3,814,697,265,625 to 25*24*23*...*17 = 741,354,768,000.

Since you've now indicated that all entries in the m[] array are less than 10, there are certain to be duplicates, which need to be skipped when searching. As Daniel points out, searching from the top, the first prime found will be the biggest. This is achieved by pre-sorting the m[] array in descending order.

Upvotes: 0

Daniel Fischer

Reputation: 183988

You're checking 25^9 = 3,814,697,265,625 numbers for primality. That's a lot of prime tests and will always take a long time. Even in the best case (for performance), when all array entries (in m) are 0 (never mind that the test considers 0 a prime), so that the trial division loop never runs, it will take hours. When all entries of m are positive, the code as is will run for hundreds or thousands of years, since then each number will be trial-divided by more than 50,000,000 numbers.

Looking at the prime check,

check_prime = 1;

for ( y = 2; y <= num/2; y++)
{
    if ( num % y == 0 )
        check_prime = 0;
}

the first glaring inefficiency is that the loop continues even after a divisor has been found and the compositeness of num established. Break out of the loop as soon as you know the outcome.

check_prime = 1;

for ( y = 2; y <= num/2; y++)
{
    if ( num % y == 0 )
    {
        check_prime = 0;
        break;
    }
}

In the unfortunate case that all numbers you test are prime, that won't change a thing, but if all (or almost all, for sufficiently large values of almost) the numbers are composite, it will cut the running time by a factor of at least 5000.

The next thing is that you divide up to num/2. That is not necessary. Why do you stop at num/2, and not at num - 1? Well, because you figured out that the largest proper divisor of num cannot be larger than num/2 because if (num >) k > num/2, then 2*k > num and num is not a multiple of k.

That's good, not everybody sees that.

But you can pursue that train of thought further. If num/2 is a divisor of num, that means num = 2*(num/2) (using integer division, with the exception of num = 3). But then num is even, and its compositeness was already determined by the division by 2, so the division by num/2 will never be tried if it succeeds.

So what's the next possible candidate for the largest divisor that needs to be considered? num/3 of course. But if that's a divisor of num, then num = 3*(num/3) (unless num < 9) and the division by 3 has already settled the question.

Going on, if k < √num and num/k is a divisor of num, then num = k*(num/k) and we see that num has a smaller divisor, namely k (possibly even smaller ones).

So the smallest nontrivial divisor of num is less than or equal to √num. Thus the loop needs only run for y <= √num, or y*y <= num. If no divisor has been found in that range, num is prime.

Now the question arises whether to loop

for(y = 2; y*y <= num; ++y)

or

root = floor(sqrt(num));
for(y = 2; y <= root; ++y)

The first needs one multiplication for the loop condition in each iteration, the second one computation of the square root outside the loop.

Which is faster?

That depends on the average size of num and whether many are prime or not (more precisely, on the average size of the smallest prime divisor). Computing a square root takes much longer than a multiplication; to compensate for that cost, the loop must run for many iterations (on average). Whether "many" means more than 20, more than 100 or more than 1000, say, depends. With num larger than 10^8, as is probably the case here, computing the square root is probably the better choice.

Now we have bounded the number of iterations of the trial division loop to √num whether num is composite or prime and reduced the running time by a factor of at least 5000 (assuming that all m[index] > 0, so that always num >= 10^8) regardless of how many primes are among the tested numbers. If most values num takes are composites with small prime factors, the reduction factor is much larger, to the extent that normally, the running time is almost completely used for testing primes.

Further improvement can be obtained by reducing the number of divisor candidates. If num is divisible by 4, 6, 8, ..., then it is also divisible by 2, so num % y never yields 0 for even y > 2. That means all these divisions are superfluous. By special casing 2 and incrementing the divisor candidate in steps of 2,

if (num % 2 == 0)
{
    check_prime = 0;
} else {
    root = floor(sqrt(num));
    for(y = 3; y <= root; y += 2)
    {
        if (num % y == 0)
        {
            check_prime = 0;
            break;
        }
    }
}

the number of divisions to perform and the running time is roughly halved (assuming enough bad cases that the work for even numbers is negligible).

Now, whenever y is a multiple of 3 (other than 3 itself), num % y will only be computed when num is not a multiple of 3, so these divisions are also superfluous. You can eliminate them by also special-casing 3 and letting y run through only the odd numbers that are not divisible by 3 (start with y = 5, increment by 2 and 4 alternatingly). That chops off roughly a third of the remaining work (if enough bad cases are present).

Continuing that elimination process, we need only divide num by the primes not exceeding √num to find whether it's prime or not.

So usually it would be a good idea to find the primes not exceeding the square root of the largest num you'll check, store them in an array and loop

root = floor(sqrt(num));
for(k = 0, y = primes[0]; k < prime_count && (y = primes[k]) <= root; ++k)
{
    if (num % y == 0)
    {
        check_prime = 0;
        break;
    }
}

Unless the largest value num can take is small enough, if, for example, you'll always have num < 2^31, then you should find the primes to that limit in a bit-sieve so that you can look up whether num is prime in constant time (a sieve of 2^31 bits takes 256 MB, if you only have flags for the odd numbers [needs special-casing to check whether num is even], you only need 128 MB to check the primality of numbers < 2^31 in constant time, further reduction of required space for the sieve is possible).
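A minimal sketch of such an odds-only bit sieve (function and variable names are mine; a real sieve for 2^31 would allocate the 128 MB mentioned above, here the limit is whatever you pass in): bit i stands for the odd number 2*i + 1, and a set bit marks a composite.

```c
#include <stdlib.h>

/* Odds-only bit sieve: bit (n/2) represents the odd number n.
   A set bit marks a composite. */
static unsigned char *sieve;

void sieve_init(unsigned long limit)
{
    unsigned long bits = limit / 2 + 1;
    sieve = calloc((bits + 7) / 8, 1);
    sieve[0] |= 1;                                   /* 1 is not prime */
    for (unsigned long p = 3; p * p <= limit; p += 2)
        if (!(sieve[p/2 >> 3] & (1 << (p/2 & 7))))   /* p is prime */
            for (unsigned long q = p * p; q <= limit; q += 2 * p)
                sieve[q/2 >> 3] |= 1 << (q/2 & 7);   /* mark odd multiples */
}

int sieve_is_prime(unsigned long n)   /* n must not exceed the sieved limit */
{
    if (n == 2) return 1;
    if (n < 2 || n % 2 == 0) return 0;               /* special-case even numbers */
    return !(sieve[n/2 >> 3] & (1 << (n/2 & 7)));
}
```

After `sieve_init`, each `sieve_is_prime` call is a constant-time bit lookup, which is exactly what makes the 25^9-candidate search feasible compared to trial-dividing each candidate.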

So far for the prime test itself.

If the m array contains numbers divisible by 2 or by 5, it may be worthwhile to reorder the loops, make the loop for i the outermost, and skip the inner loops if m[i] is divisible by 2 or by 5: all the other numbers are multiplied by powers of 10 before adding, so num would then be a multiple of 2 or 5, respectively, and not prime.

But, despite all that, it will still take long to run the code. Nine nested loops reek of a wrong design.

What is it that you try to do? Maybe we can help finding the correct design.

Upvotes: 1

David Schwartz

Reputation: 182893

Replace this code with code using a sensible algorithm, such as the Sieve of Eratosthenes. The most important "optimization" is choosing the right algorithm in the first place.

If your algorithm for sorting numbers is to swap them randomly until they're in order, it doesn't matter how much you optimize the selecting of the random entries, swapping them, or checking if they're in order. A bad algorithm will mean bad performance regardless.
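As a sketch of that suggestion (function name and interface are mine), a plain Sieve of Eratosthenes that collects all primes up to a limit, which could then feed the primes[] array used in the trial-division loop of the other answer:

```c
#include <stdlib.h>

/* Classic Sieve of Eratosthenes: returns a malloc'd array of all
   primes <= limit and stores their count in *count. */
unsigned long *sieve_primes(unsigned long limit, unsigned long *count)
{
    char *composite = calloc(limit + 1, 1);
    for (unsigned long p = 2; p * p <= limit; p++)
        if (!composite[p])
            for (unsigned long q = p * p; q <= limit; q += p)
                composite[q] = 1;                 /* cross off multiples of p */

    unsigned long n = 0;
    for (unsigned long p = 2; p <= limit; p++)
        if (!composite[p]) n++;                   /* count survivors */

    unsigned long *primes = malloc(n * sizeof *primes);
    n = 0;
    for (unsigned long p = 2; p <= limit; p++)
        if (!composite[p]) primes[n++] = p;       /* collect them */

    free(composite);
    *count = n;
    return primes;
}
```

Sieving up to the square root of the largest candidate (here about 32,000 for 9-digit numbers) takes a negligible fraction of a second and replaces billions of redundant divisions.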

Upvotes: 6
