Reputation: 21
Could anyone explain to me why the algorithm below is an error-free integer factorization method that always return a non-trivial factor of N. I know how weird this sounds, but I designed this method 2 years ago and still don't understand the mathematical logic behind it, which is making it difficult for me to improve it. It's so simple that it involves only addition and subtraction.
public static long factorX( long N )
{
long x=0, y=0;
long b = (long)(Math.sqrt(N));
long a = b*(b+1)-N;
if( a==b ) return a;
while ( a!= 0 )
{
a-= ( 2+2*x++ - y);
if( a<0 ) { a+= (x+b+1); y++; }
}
return ( x+b+1 );
}
It seems that the above method actually finds a solution by iteration to the diophantine equation:
f(x,y) = a - x(x+1) + (x+b+1)y
where b = floor( sqrt(N) ) and a = b(b+1) - N
that is, when a = 0, f(x,y) = 0 and (x+b+1) is a factor of N.
Example: N = 8509
b = 92, a = 47
f(34,9) = 47 - 34(34+1) + 9(34+92+1) = 0
and so x+b+1 = 127 is a factor of N.
Rewriting the method:
public static long factorX(long N)
{
long x=1, y=0, f=1;
long b = (long)(Math.sqrt(N));
long a = b*(b+1)-N;
if( a==b ) return a;
while( f != 0 )
{
f = a - x*(x+1) + (x+b+1)*y;
if( f < 0 ) y++;
x++;
}
return x+b+1;
}
I'd really appreciate any suggestions on how to improve this method.
Here's a list of 10 18-digit random semiprimes:
349752871155505651 = 666524689 x 524741059 in 322 ms
259160452058194903 = 598230151 x 433211953 in 404 ms
339850094323758691 = 764567807 x 444499613 in 1037 ms
244246972999490723 = 606170657 x 402934339 in 560 ms
285622950750261931 = 576888113 x 495109787 in 174 ms
191975635567268981 = 463688299 x 414018719 in 101 ms
207216185150553571 = 628978741 x 329448631 in 1029 ms
224869951114090657 = 675730721 x 332780417 in 1165 ms
315886983148626037 = 590221057 x 535201141 in 110 ms
810807767237895131 = 957028363 x 847213937 in 226 ms
469066333624309021 = 863917189 x 542952889 in 914 ms
Upvotes: 2
Views: 487
Reputation: 101
Pardon my c#, I don't know Java. Stepping x and y by 2 increases algorithm speed.
using System.Numerics; // needed for BigInteger
/* Methods ************************************************************/
private static BigInteger sfactor(BigInteger k) // factor odd integers
{
BigInteger x, y;
int flag;
x = y = iSqrt(k); // Integer Square Root
if (x % 2 == 0) { x -= 1; y += 1; } // if even make x & y odd
do
{
flag = BigInteger.Compare((x*y), k);
if (flag > 0) x -= 2;
y += 2;
} while(flag != 0);
return x; // return x
} // end of sfactor()
// Finds the integer square root of a positive number
private static BigInteger iSqrt(BigInteger num)
{
if (0 == num) { return 0; } // Avoid zero divide
BigInteger n = (num / 2) + 1; // Initial estimate, never low
BigInteger n1 = (n + (num / n)) >> 1; // right shift to divide by 2
while (n1 < n)
{
n = n1;
n1 = (n + (num / n)) >> 1; // right shift to divide by 2
}
return n;
} // end iSqrt()
Upvotes: 0
Reputation: 20027
Your method is a variant of trial multiplication of (n-a)*(n+b), where n=floor(sqrt(N)) and b==1.
The algorithm then iterates a-- / b++ until the difference of the (n-a)*(n+b) - N == 0.
The partial differences (in respect of a and b) are in proportion to 2b and 2a respectively. Thus no true multiplication are necessary.
The complexity is a linear function of |a| or |b| -- the more "square" N is, the faster the method converges. In summary, there are much faster methods, one of the easiest to understand being the quadratic residue sieve.
Upvotes: 1
Reputation: 1785
OK, I used Matlab to see what was going here. Here is the result for N=100000:
You are increasing x
on each iteration, and the funny pattern of a
variable is strongly related with the remainder N % x+b+1
(as you can see in the gray line of the plot, a + (N % (x+b+1)) - x = floor(sqrt(N))
).
Thus, I think you are just finding the first factor larger than sqrt(N)
by simple iteration, but with a rather obscure criterion to decide it is really a factor :D
(Sorry for the half-answer... I have to leave, I will maybe continue later).
Here is the matlab code, just in case you want it to test by yourself:
clear all
close all
N = int64(100000);
histx = [];
histDiffA = [];
histy = [];
hista = [];
histMod = [];
histb = [];
x=int64(0);
y=int64(0);
b = int64(floor(sqrt(double(N))));
a = int64(b*(b+1)-N);
if( a==b )
factor = a;
else
while ( a ~= 0 )
a = a - ( 2+2*x - y);
histDiffA(end+1) = ( 2+2*x - y);
x = x+1;
if( a<0 )
a = a + (x+b+1);
y = y+1;
end
hista(end+1) = a;
histb(end+1) = b;
histx(end+1) = x;
histy(end+1) = y;
histMod(end+1) = mod(N,(x+b+1));
end
factor = x+b+1;
end
figure('Name', 'Values');
hold on
plot(hista,'-or')
plot(hista+histMod-histx,'--*', 'Color', [0.7 0.7 0.7])
plot(histb,'-ob')
plot(histx,'-*g')
plot(histy,'-*y')
legend({'a', 'a+mod(N,x+b+1)-x', 'b', 'x', 'y'}); % 'Input',
hold off
fprintf( 'factor is %d \n', factor );
Upvotes: 2