Wissam
Wissam

Reputation: 21

Integer Factorization

Could anyone explain to me why the algorithm below is an error-free integer factorization method that always return a non-trivial factor of N. I know how weird this sounds, but I designed this method 2 years ago and still don't understand the mathematical logic behind it, which is making it difficult for me to improve it. It's so simple that it involves only addition and subtraction.

public static long factorX( long N )
{
    long x=0, y=0;
    long b = (long)(Math.sqrt(N));
    long a = b*(b+1)-N;
    if( a==b ) return a;

    while ( a!= 0 )
    {
        a-= ( 2+2*x++ - y);
        if( a<0 ) { a+= (x+b+1); y++; }
    }

return ( x+b+1 );
}

It seems that the above method actually finds a solution by iteration to the diophantine equation:

f(x,y) = a - x(x+1) + (x+b+1)y
where b = floor( sqrt(N) ) and a = b(b+1) - N
that is, when a = 0, f(x,y) = 0 and (x+b+1) is a factor of N.

Example: N = 8509  
b = 92, a = 47  
f(34,9) = 47 - 34(34+1) + 9(34+92+1) = 0  
and so x+b+1 = 127 is a factor of N.  

Rewriting the method:

public static long factorX(long N)
{
long x=1, y=0, f=1;
long b = (long)(Math.sqrt(N));
long a = b*(b+1)-N;
if( a==b ) return a;

  while( f != 0 )
  {
    f = a - x*(x+1) + (x+b+1)*y;
      if( f < 0 ) y++;
    x++;
  }

return x+b+1;
}  

I'd really appreciate any suggestions on how to improve this method.

Here's a list of 10 18-digit random semiprimes:

349752871155505651 = 666524689 x 524741059   in 322 ms
259160452058194903 = 598230151 x 433211953   in 404 ms
339850094323758691 = 764567807 x 444499613   in 1037 ms
244246972999490723 = 606170657 x 402934339   in 560 ms
285622950750261931 = 576888113 x 495109787   in 174 ms
191975635567268981 = 463688299 x 414018719   in 101 ms
207216185150553571 = 628978741 x 329448631   in 1029 ms
224869951114090657 = 675730721 x 332780417   in 1165 ms
315886983148626037 = 590221057 x 535201141   in 110 ms
810807767237895131 = 957028363 x 847213937   in 226 ms
469066333624309021 = 863917189 x 542952889   in 914 ms

Upvotes: 2

Views: 487

Answers (3)

Optionparty
Optionparty

Reputation: 101

Pardon my c#, I don't know Java. Stepping x and y by 2 increases algorithm speed.

using System.Numerics;                      // needed for BigInteger
    /* Methods ************************************************************/
    private static BigInteger sfactor(BigInteger k) // factor odd integers
    {
        BigInteger x, y;
        int flag;
        x = y = iSqrt(k);                   // Integer Square Root
        if (x % 2 == 0) { x -= 1; y += 1; } // if even make x & y odd
        do
        { 
            flag = BigInteger.Compare((x*y), k);
            if (flag > 0) x -= 2;
            y += 2;
        } while(flag != 0);
        return x;                           // return x
    } // end of sfactor()

    // Finds the integer square root of a positive number
    private static BigInteger iSqrt(BigInteger num)
    {
        if (0 == num) { return 0; }           // Avoid zero divide
        BigInteger n = (num / 2) + 1;         // Initial estimate, never low
        BigInteger n1 = (n + (num / n)) >> 1; // right shift to divide by 2
        while (n1 < n)
        {
            n = n1;
            n1 = (n + (num / n)) >> 1;        // right shift to divide by 2
        }
        return n;
    } // end iSqrt()

Upvotes: 0

Aki Suihkonen
Aki Suihkonen

Reputation: 20027

Your method is a variant of trial multiplication of (n-a)*(n+b), where n=floor(sqrt(N)) and b==1.

The algorithm then iterates a-- / b++ until the difference of the (n-a)*(n+b) - N == 0.

The partial differences (in respect of a and b) are in proportion to 2b and 2a respectively. Thus no true multiplication are necessary.

The complexity is a linear function of |a| or |b| -- the more "square" N is, the faster the method converges. In summary, there are much faster methods, one of the easiest to understand being the quadratic residue sieve.

Upvotes: 1

dunadar
dunadar

Reputation: 1785

OK, I used Matlab to see what was going here. Here is the result for N=100000: enter image description here You are increasing x on each iteration, and the funny pattern of a variable is strongly related with the remainder N % x+b+1 (as you can see in the gray line of the plot, a + (N % (x+b+1)) - x = floor(sqrt(N))). Thus, I think you are just finding the first factor larger than sqrt(N) by simple iteration, but with a rather obscure criterion to decide it is really a factor :D (Sorry for the half-answer... I have to leave, I will maybe continue later).

Here is the matlab code, just in case you want it to test by yourself:

clear all
close all

N = int64(100000);

histx = [];
histDiffA = [];
histy = [];
hista = [];
histMod = [];
histb = [];

x=int64(0);
y=int64(0);
b = int64(floor(sqrt(double(N))));
a = int64(b*(b+1)-N);
if( a==b ) 
    factor = a;
else
    while ( a ~= 0 )
        a = a - ( 2+2*x - y);
        histDiffA(end+1) = ( 2+2*x - y);
        x = x+1;
        if( a<0 ) 
            a = a + (x+b+1); 
            y = y+1;
        end
        hista(end+1) = a;
        histb(end+1) = b;
        histx(end+1) = x;
        histy(end+1) = y;
        histMod(end+1) = mod(N,(x+b+1));
    end
    factor = x+b+1;
end

figure('Name', 'Values');
hold on
    plot(hista,'-or')
    plot(hista+histMod-histx,'--*', 'Color', [0.7 0.7 0.7])
    plot(histb,'-ob')
    plot(histx,'-*g')
    plot(histy,'-*y')
    legend({'a', 'a+mod(N,x+b+1)-x', 'b', 'x', 'y'}); % 'Input',
hold off

fprintf( 'factor is %d \n', factor );

Upvotes: 2

Related Questions