Reputation: 43
The number is
112887987371630998240814603336195913423482111436696007401429072377238341647882152698281999652360869
My code is below
def getfactors(number):
factors = []
for potentialFactor in range(1 , int(math.sqrt(number)) + 1):
if number % potentialFactor == 0:
factors.append(potentialFactor)
return factors
and the input is
getfactors(112887987371630998240814603336195913423482111436696007401429072377238341647882152698281999652360869)
The program has been running for at least 3 hours and I still have no results from it yet. The code works with other numbers too. Is there any algorithm or method that I could use to speed this up?
Upvotes: 3
Views: 852
Reputation: 7607
I have authored a library for the R
programming language called RcppBigIntAlgos
that can factor these types of numbers in a reasonable amount of time.
As others have pointed out, your number is going to be very difficult to factor by any program you roll yourself unless you invest a considerable amount of time into it. For me personally, I have invested thousands of hours. Your mileage may vary.
Here is my test run in a vanilla R
console (no special IDE):
numDig99 <- "112887987371630998240814603336195913423482111436696007401429072377238341647882152698281999652360869"
## install.packages("RcppBigIntAlgos") if necessary
library(RcppBigIntAlgos)
prime_fac <- quadraticSieve(numDig99, showStats = TRUE, nThreads = 8)
Summary Statistics for Factoring:
#> 112887987371630998240814603336195913423482111436696007401429072377238341647882152698281999652360869
#>
#> | MPQS Time | Complete | Polynomials | Smooths | Partials |
#> |--------------------|----------|-------------|------------|------------|
#> | 6h 37m 48s 335ms | 100% | 11591331 | 8768 | 15707 |
#>
#> | Mat Algebra Time | Mat Dimension |
#> |--------------------|--------------------|
#> | 27s 745ms | 24393 x 24475 |
#>
#> | Total Time |
#> |--------------------|
#> | 6h 38m 17s 401ms |
And just as @kelalaka, we obtain the same hashed values (Again, ran right in the R
console):
system(sprintf("printf %s | openssl sha512", prime_fac[1]))
#> faebc6b3645d45f76c1944c6bd0c51f4e0d276ca750b6b5bc82c162e1e9364e01aab42a85245658d0053af526ba718ec006774b7084235d166e93015fac7733d
system(sprintf("printf %s | openssl sha512", prime_fac[2]))
#> 27c64b5b944146aa1e40b35bd09307d04afa8d5fa2a93df9c5e13dc19ab032980ad6d564ab23bfe9484f64c4c43a993c09360f62f6d70a5759dfeabf59f18386
Also, to show that the factorization is indeed correct:
## The product of the result is indeed the original number
prod(prime_fac) == numDig99
#> [1] TRUE
## There are two factors
length(prime_fac)
#> [1] 2
## They are prime. The gmp package uses the Miller-Rabin primality test
gmp::isprime(prime_fac)
#> [1] 1 1
The algorithm implemented in RcppBigIntAlgos::quadraticSieve
is the Multiple Polynomial Quadratic Sieve. There is a more efficient version of the quadratic sieve known as the Self Initializing Quadratic Sieve, however, there isn't as much literature in the wild available.
Here are my machine specs:
MacBook Air (2022)
Processor: Apple M2
Memory; 24 GB
And here is my R
info:
sessionInfo()
#> R version 4.3.1 (2023-06-16)
#> Platform: aarch64-apple-darwin20 (64-bit)
#> Running under: macOS Ventura 13.4.1
#>
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
#>
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> time zone: America/New_York
#> tzcode source: internal
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] RcppBigIntAlgos_1.1.0
#>
#> loaded via a namespace (and not attached):
#> [1] compiler_4.3.1 gmp_0.7-2
Upvotes: 1
Reputation: 5636
Your method will take a lot of time to factor in the given number since the RSA primes are close to each other. Even sieving with Sieve of Eratosthenes won't help, since you have a 326-bit number. Can you sieving to 163-bit, there is no way. This is slightly larger than the first RSA challenge RSA-100 that has 300-bit.
Use existing libraries like the
The experiments
I have tried with Pollard's p-1 algorithm, still running for one and a half-day and did not produce a result, yet. This is what expected due to the B bound must be around 2^55
with success probability 1/27
. I've stopped the experiment after the CADO-NFS succeeds. This is self-implemented Pollard's p-1
, one can find an optimized in GMP-ECM
Tried the CADO-NFS. The stable version may not be easily compiled for new machines, so prefer the active one from the GitLab.
After ~6 hours with 4 cores, CADO-NFS produced the result. As expected this is an RSA CTF/Challange. Since I don't want to spoil the fun; here the hash commitments with SHA-512, it is executed with OpenSSL;
echo -n "prime x" | openssl sha512
27c64b5b944146aa1e40b35bd09307d04afa8d5fa2a93df9c5e13dc19ab032980ad6d564ab23bfe9484f64c4c43a993c09360f62f6d70a5759dfeabf59f18386
faebc6b3645d45f76c1944c6bd0c51f4e0d276ca750b6b5bc82c162e1e9364e01aab42a85245658d0053af526ba718ec006774b7084235d166e93015fac7733d
Details of the experiment
CPU : Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
RAM : 32GB - doesn't require much ram, at least during polynomial selection and Sieveing.
Dedicated cores : 4
Test machine Ubuntu 20.04.1 LTS
CUDA - NO
gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)
cmake version 3.16.3
external libraries: Nothing out of Ubuntu's canonicals
CODA-NFS version : GitLab develepment version cloned at 23-01-2021
The bit sizes;
The cado-nfs-2.3.0
did not compile and giving errors about HWLOC- HWLOC_TOPOLOGY_FLAG_IO_DEVICES
. Asked a friend to test the compile and it worked for them. It was an older Linux version. So I decided to use the GitLab version.
Upvotes: 11
Reputation: 15693
What do you know about this number? If it is an RSA public key then it only has two large prime factors. If it is a random number then it will very probably have small prime factors. The type of number will determine how you want to approach factorising it.
Two ancillary functions will also be useful. First a Sieve of Eratosthenes to build a list of primes up to, say 50,000 or some convenient limit. Second a large number prime test, such as Miller-Rabin, to check if the residue is prime or not.
Use the sieve of Eratosthenes to give you all the small primes up to a convenient limit. Test for each prime in turn up to the square root of the target number. When you find a prime that divides the test number, divide the test number to make it smaller. A prime may divide in more than once. Reset the prime limit to the square root of the smaller number once all the divisions are finished.
if (numToTest MOD trialFactor = 0)
repeat
listOfFactors.add(trialFactor)
numToTest <- numToTest/trialFactor
until (numToTest MOD trialFactor != 0)
primeLimit <- sqrt(numTotest)
endif
Once the number you are testing has been reduced to 1, you have completely factored it.
If you run out of primes before completely factoring the target, it is worth running the Miller-Rabin test 64 times to see if the remainder is itself prime; potentially that can save you a lot of work trying to find non-existent factors of a large prime. If the remainder is composite then you can either try again with a larger sieve or use one of the heavy duty factoring methods: Quadratic Sieve, Elliptic Curve etc.
Upvotes: 2