Reputation: 117
I'm currently doing a primality test on huge numbers (up to 10M digits).
Right now, I'm using a C program built on the GMP library. I parallelized it with OpenMP and got a nice speedup (~3.5x with 4 cores). The problem is that I don't have enough CPU cores to make running my whole dataset feasible.
I have an NVIDIA GPU, so I tried to find an alternative to GMP that runs on GPUs. Either CUDA or OpenCL would be fine.
Is there an arbitrary-precision library that I can run on my GPU? I'm also open to using another programming language if there is a simpler or more elegant way of doing it.
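For reference, my current setup is roughly shaped like the sketch below (simplified; the candidate values are placeholders, since my real numbers have up to 10M digits, and mpz_probab_prime_p is GMP's standard probabilistic primality test):
#include <stdio.h>
#include <gmp.h>
#include <omp.h>

int main(void) {
    /* Placeholder dataset: the real numbers have up to 10M digits. */
    const char *candidates[] = {
        "170141183460469231731687303715884105727", /* 2^127 - 1, a Mersenne prime */
        "170141183460469231731687303715884105725"
    };
    const int n = 2;

    /* Each candidate is tested independently, one per thread. */
    #pragma omp parallel for schedule(dynamic)
    for (int i = 0; i < n; i++) {
        mpz_t x;
        mpz_init_set_str(x, candidates[i], 10);
        /* 25 rounds; returns 2 = prime, 1 = probably prime, 0 = composite */
        int r = mpz_probab_prime_p(x, 25);
        printf("candidate %d: %s\n", i, r ? "probably prime" : "composite");
        mpz_clear(x);
    }
    return 0;
}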
Upvotes: 8
Views: 2956
Reputation: 307
You could use my library for signed/unsigned integer arithmetic operations, as well as some number theory functions such as POW, GCD, LCM:
#include <cstdio>
#include <Aeu.h>

__global__ void test() {
    const auto tid = blockIdx.x * blockDim.x + threadIdx.x;
    if(tid != 0) return; // let only the first thread print

    Aeu<128> amp = 1562144106091796071UL; // 128-bit unsigned integer
    printf("We're in a kernel thread and the number is %lu\n", amp.integralCast<unsigned long>());
}

int main() {
    test<<<32, 32>>>();
    // Report failure if the kernel did not complete successfully
    return cudaSuccess != cudaDeviceSynchronize();
}
Upvotes: 1
Reputation: 462
It seems the Julia Language is already able to do multiprecision arithmetic and use the GPU (see here for a simple example combining these two), but you might have to learn Julia and rewrite your code.
The CUMP library is meant to be a GMP substitute for CUDA. It attempts to make porting GMP code to CUDA easier by offering an interface similar to GMP's: for example, you replace mpf_... variables and functions with cumpf_... ones. There is a demo you can make if it fits your CUDA setup. No documentation or support though, so you'll have to go through the code to see if it works.
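As a sketch of what that substitution looks like, here is a plain GMP mpf_ snippet with comments marking where the cumpf_ equivalents would go (the CUMP names in the comments just follow the prefix convention above; I haven't verified the exact signatures, so check the demo sources):
#include <stdio.h>
#include <gmp.h>

int main(void) {
    mpf_t a, b, c;                  /* CUMP: cumpf_t, per the naming scheme */
    mpf_init2(a, 4096);             /* CUMP: cumpf_init2(...), 4096-bit mantissa */
    mpf_init2(b, 4096);
    mpf_init2(c, 4096);
    mpf_set_str(a, "3.1415926535897932384626433832795", 10);
    mpf_set_str(b, "2.7182818284590452353602874713527", 10);
    mpf_mul(c, a, b);               /* CUMP: cumpf_mul(...), executed on the GPU */
    gmp_printf("%.30Ff\n", c);
    mpf_clear(a); mpf_clear(b); mpf_clear(c);
    return 0;
}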
The CAMPARY library from the people at LAAS-CNRS might be worth a shot as well, but there's no documentation either. It has been applied in more research than CUMP, so there's some chance there. An answer here gives some clarification on how to use it.
GRNS uses the residue number system on CUDA-compatible processors; it has no documentation, but an interesting paper is available. Also, see this one.
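To give a feel for the approach (this is not GRNS's API, just a toy CUDA sketch of the residue number system itself): a value is stored as its residues modulo several pairwise-coprime moduli, so addition and multiplication decompose into independent, carry-free per-modulus operations that map naturally onto GPU threads.
#include <cstdio>

/* Toy RNS demo with three small pairwise-coprime moduli; real RNS
   libraries use many larger moduli to cover big dynamic ranges. */
#define NMOD 3
__constant__ unsigned long long dModuli[NMOD] = { 1019ULL, 1021ULL, 1031ULL };

/* Each thread multiplies one residue channel independently: no carries
   cross channels, which is what makes RNS GPU-friendly. */
__global__ void rnsMul(const unsigned long long* a,
                       const unsigned long long* b,
                       unsigned long long* c) {
    const int i = threadIdx.x;
    if (i < NMOD) c[i] = (a[i] * b[i]) % dModuli[i];
}

int main() {
    const unsigned long long x = 123456ULL, y = 7890ULL;
    const unsigned long long m[NMOD] = { 1019ULL, 1021ULL, 1031ULL };
    unsigned long long ha[NMOD], hb[NMOD], hc[NMOD];
    for (int i = 0; i < NMOD; i++) { ha[i] = x % m[i]; hb[i] = y % m[i]; }

    unsigned long long *da, *db, *dc;
    cudaMalloc(&da, sizeof ha); cudaMalloc(&db, sizeof hb); cudaMalloc(&dc, sizeof hc);
    cudaMemcpy(da, ha, sizeof ha, cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb, sizeof hb, cudaMemcpyHostToDevice);

    rnsMul<<<1, NMOD>>>(da, db, dc);
    cudaMemcpy(hc, dc, sizeof hc, cudaMemcpyDeviceToHost);

    /* Each channel must match (x*y) mod m[i]. */
    for (int i = 0; i < NMOD; i++)
        printf("channel %d: %llu (expected %llu)\n", i, hc[i], (x * y) % m[i]);

    cudaFree(da); cudaFree(db); cudaFree(dc);
    return 0;
}
Converting back to ordinary positional representation requires the Chinese Remainder Theorem, which is where much of the real complexity of an RNS library like GRNS lives.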
XMP comes directly from NVIDIA labs, but it seems incomplete and has no docs either. Some info here and a paper here.
XMP 2.0 seems newer but only supports sizes up to 32k bits for now.
GPUMP looks promising, but it doesn't seem to be available for download; you could perhaps get it by contacting the authors.
MPRES-BLAS is a multiple-precision library for linear algebra on CUDA, and it does - of course - have code for basic arithmetic. Also, papers here and here.
I haven't tested any of these, so please let us know if any of them works well.
Upvotes: 9