Reputation: 49

How do I make this code work for large input values?

#include <stdio.h>

int main()
{
    int i,j,k,t;
    long int n;
    int count;
    int a,b;
    float c;

    scanf("%d",&t);
    for(k=0;k<t;k++)
    {
        count=0;
        scanf("%d",&n);
        for(i=1;i<n;i++)
        {

            a=pow(i,2);
            for(j=i;j<n;j++)
            {
                b=pow(j,2);
                c=sqrt(a+b);
                if((c-floor(c)==0)&&c<=n)
                ++count;


            }


        }
        printf("%d\n",count);
    }
    return 0;
}

The above is a c code that counts the number of Pythagorean triplets within range 1..n. How do I optimize it ? It times out for large input . 1<=T<=100 1<=N<=10^6

Upvotes: 1

Answers (3)

uesp

Reputation: 6204

Your inner two loops are O(n*n) so there's not too much that can be done without changing algorithms. Just looking at the inner loop the best I could come up with in a short time was the following:

unsigned long long int i,j,k,t;
unsigned long long int n = 30000;  //Example for testing
unsigned long long int count = 0;
unsigned long long int a, b;
unsigned long long int c;
unsigned long long int n2 = n * n;

for(i=1; i<n; i++)
{
    a = i*i;

    for(j=i; j<n; j++)
    {
        b = j*j;

        unsigned long long int sum = a + b;
        if (sum > n2) break;

              // Check for multiples of 2, 3, and 5
        if ( (sum & 2) || ((sum & 7) == 5) || ((sum & 11) == 8) ) continue;

        c = sqrt((double)sum);
        if (c*c == sum) ++count;
    }
}

A few comments:

For the case of n=30000 this is roughly twice as fast as your original.
If you don't mind n being limited to 65535 you can switch to unsigned int to get a x2 speed increase (or roughly x4 faster than your original).
The check for multiples of 2/3/5 increases the speed by a factor of two. You may be able to increase this by looking at the answers to this question.
Your original code has integer overflows when i > 65535 which is the reason I switched to 64-bit integers for everything.
I think your method of checking for a perfect square doesn't always work due to the inherent in-precision of floating point numbers. The method in my example should get around that and is slightly faster anyways.
You are still bound to the O(n*n) algorithm. On my machine the code for n=30000 runs in about 6 seconds which means the n=1000000 case will take close to 2 hours. Looking at Wikipedia shows a host of other algorithms you could explore.

Upvotes: 1

alexpq

Reputation: 36

You can define your indexes as (unsigned) long or even (unsigned) long long, but you may have to use big num libraries to solve your problem for huge numbers. Using unsigned uppers your Max number limit but forces you to work with positive numbers. I doubt you'll need bigger than long long though. It seems your question is about optimising your code to make it faster. If you read up on Pythagorean triplets you will see there is a way to calculate them using integer parameters. If 3 4 5 are triplets then we know that 2*3 2*4 2*5 are also triplets and k*3 k*4 k*5 are also triplets. Your algorithm is checking all of those triplets. There are better algorithms to use, but I'm afraid you will have to search on Google to study about Pythagorean triplets.

Upvotes: 0

dbgrman

Reputation: 5681

It really depends on what the benchmark is that you are expecting.

But for now, the power function could be a bottle neck in this. I think you can do either of the two things:

a) precalculate and save in a file and then load into a dictionary all the squared values. Depending on the input size, that might be loading your memory.

b) memorize previously calculated squared values so that when asked again, you could reuse it there by saving CPU time. This again, would eventually load your memory.

Upvotes: 0

How do I make this code work for large input values?

Answers (3)

Related Questions