Reputation: 303
I am using PARI/GP, which is a mathematics program with some helpful functionality for number theory, especially because it supports very large integers out of the box. For a previous C++ project I had to use a library called BigInt.
At the moment I am using PARI/GP's gcd() function to calculate the greatest common divisor (GCD) of numbers ranging from 0 to 255 digits in length, so as you can imagine the numbers do get very large! I set a = 0 and then my loop iterates upwards, each time calculating gcd(a, b), where b is a long fixed number that never changes.
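In GP terms the loop is essentially the following (a simplified sketch; 2^512 + 1 is just a stand-in for my real fixed b):

? b = 2^512 + 1; \\ stand-in for my fixed large number
? a = 0;
? while (1, g = gcd(a, b); a++);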
I was wondering if perhaps I should use Euclid's approach to calculating the GCD, which I believe is the following simple recurrence: gcd(b, a % b), where the % symbol means modulo. Hopefully I got the variables in the correct order!
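For instance, checking the recurrence on small numbers in GP (purely as a sanity check):

? a = 1071; b = 462;
? print(gcd(a, b), " ", gcd(b, a % b)) \\ both equal 21
21 21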
Is there a rough and quick way to estimate which of the approaches shown above for calculating the GCD is faster? I would, of course, be open to other approaches that are quicker.
I do not expect my algorithm to ever finish; this is just an experiment to see how far it can get depending on which approach I use to calculate the GCD.
Upvotes: 1
Views: 679
Reputation: 921
The best approach is to let PARI do the work for you.
First, you can compute the GCD of a large number of inputs stored in a vector v as gcd(v).
? B=10^255; v = vector(10^6,i,random(B));
? gcd(v);
time = 22 ms.
? a = 0; for(i = 1, #v, a = gcd(a,v[i]))
time = 232 ms. \\ much worse
There are two reasons for this to be much faster on such small inputs: loop overhead and variable assignments on the one hand, and early abort on the other hand (as soon as the intermediate answer is 1, we can stop). You can multiply v by 2, say, to prevent the second optimization; the simple gcd(v) will remain faster [because the loop and assignment overhead still occurs, but in C rather than in interpreted GP; for small inputs this overhead is very noticeable, and it becomes negligible as the sizes increase].
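For instance, a quick way to redo the comparison without the early abort (no timings shown, since they depend on your machine and on the PARI build):

? w = 2*v; \\ every entry is even, so the running GCD never drops to 1
? gcd(w);
? a = 0; for(i = 1, #w, a = gcd(a, w[i]));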
Similarly, it should always be faster on average to let the gcd function work out by itself how best to compute gcd(a, b) than to try and "improve" things by using tricks such as gcd(b, a % b) [note: the order doesn't matter, and this will error out if b = 0, which gcd is clever enough to check]. gcd(a, b-a) will not error out, but it will slow things down on average. For instance, gcd(a, b) will try an initial Euclidean step in case a and b have vastly differing sizes, so it shouldn't help to try and add that step yourself.
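To illustrate the b = 0 corner case:

? print(gcd(42, 0)) \\ the built-in handles a zero argument fine
42
? gcd(0, 42 % 0) \\ the "improved" form breaks here: 42 % 0 is a division by zero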
Finally, the exact algorithms used depend on the underlying multiprecision library, either native PARI or GNU's GMP, the latter being faster due to a highly optimized implementation. In both cases the algorithm changes as operand sizes increase: Euclid's algorithm, then binary plus/minus [dividing out powers of 2, we can assume a and b odd, then use gcd(b, (a-b)/4) if a = b mod 4 and gcd(b, (a+b)/4) otherwise; divisions are just binary shifts], then asymptotically fast half-gcd (almost linear in the bit size). The latter is almost surely not being used in your computations, since the threshold should be over 10,000 decimal digits. On the other hand, Euclid's algorithm will only be used for tiny (word-size) operands, but since all algorithms are recursive it will eventually be used, once the size has become tiny enough.
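To make the plus/minus step concrete, here is a toy GP re-implementation of that recurrence (the name pmgcd is mine; this is only an illustration, not PARI's actual C code, and being interpreted it is of course much slower than the built-in gcd):

pmgcd(a, b) =
{
  if (a == 0, return(b));
  if (b == 0, return(a));
  \\ factor out the common power of 2, then make a and b odd
  my(g = 2^min(valuation(a, 2), valuation(b, 2)));
  a = a >> valuation(a, 2);
  b = b >> valuation(b, 2);
  while (1,
    \\ a and b are odd, so exactly one of a-b, a+b is divisible by 4
    my(c = if ((a - b) % 4 == 0, abs(a - b), a + b) / 4);
    if (c == 0, return(g * b));
    [a, b] = [b, c >> valuation(c, 2)]  \\ keep only the odd part of c
  );
}

? print(pmgcd(12, 18), " ", gcd(12, 18)) \\ both give 6
6 6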
If you want to investigate the speed of the gcd function, try it with integers of around 100,000 decimal digits (then double that size, say); you should observe the almost linear complexity.
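For instance (## prints the time of the latest computation; no timings are shown here since they depend on your machine and on whether PARI was built with GMP):

? a = random(10^100000); b = random(10^100000);
? gcd(a, b);
? ##
? a = random(10^200000); b = random(10^200000);
? gcd(a, b);
? ##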
Upvotes: 1
Reputation: 65458
Binary GCD should generally be better than naive Euclid, but a being very small compared to b is a special circumstance that may trigger poor performance from Binary GCD. I'd try one round of Euclid first, i.e., gcd(b, a % b) with the larger of the two operands playing the role of a, where gcd is Binary GCD.
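A rough GP sketch of that combination (the names bin_gcd and gcd1 are mine, purely for illustration; GP's built-in gcd already chooses its own strategy internally):

bin_gcd(a, b) =
{ \\ toy binary (Stein) GCD for non-negative integers
  if (a == 0, return(b));
  if (b == 0, return(a));
  my(s = min(valuation(a, 2), valuation(b, 2)));
  a = a >> valuation(a, 2);
  b = b >> valuation(b, 2);
  while (a != b,
    if (a > b, a = (a - b) >> valuation(a - b, 2),
               b = (b - a) >> valuation(b - a, 2)));
  a << s;
}

\\ one Euclidean reduction to bring the operands to comparable size, then binary GCD
gcd1(big, small) = if (small == 0, big, bin_gcd(small, big % small));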
(But without knowing the underlying problem here, I’m not sure that this is the best advice.)
Upvotes: 1