Benkevitch

Reputation: 425

cuRand Mersenne twister __device__ side kernel code example

I am working on an implementation of a Markov Chain Monte Carlo (MCMC) algorithm on an NVIDIA CUDA GPU. The CPU version of the algorithm uses the high-quality Mersenne twister random number generator, and I would like to use the same generator in the GPU kernels I wrote. I have been searching for cuRand MT code examples for a long time, but I have never seen an example of kernel code that uses the Mersenne twister. The standard cuRand library documentation provides a set of functions for MTGP (MT for Graphics Processors), but it is not clear how to use them.

The CUDA Samples provide MersenneTwisterGP11213.tar.gz with an example, but it seems to cover only host code that requests fast generation of an array of random numbers on the GPU, copies them to CPU memory, and continues on the CPU. There is also a paper, "Massively Parallel RNG using CUDA C, Thrust and C#". Again, in its last section, "A Mersenne Twister implementation using CUDA C", the author provides just a simplified piece of the aforementioned host code from the CUDA Samples.

So, my first question is: can anybody give me an example of a __global__ or __device__ function that uses the cuRand Mersenne twister?

I have one more question. Currently I use a cuRand library random number generator, but I have no idea which generator it is! Let me provide a couple of pieces of my code. This is the generator initialization:

  __global__ void init_rng(Cmcmcfit *mc) {

    int ist = threadIdx.x*gridDim.x + blockIdx.x;

    if (ist >= mc->nrndst) return; // The last block can have extra threads

    unsigned long long offset = 0;

    curand_init(mc->seed, ist, offset, &mc->rndst[ist]);
  }

In other kernels I sample numbers from the uniform and normal distributions. The states of all blockDim.x*gridDim.x generators are stored in global memory, in the array mc->rndst[]. For example, curand_uniform() is used:

  .   .   .   .   .   .
  do { /* Randomly select parameter number k to make step */
    r = curand_uniform(&mc->rndst[ist]);
    k = (int) (mc->nprm*r); /* Random parameter index 0..nprm-1 into ivar[] */
  } while (k >= mc->nprm);
  .   .   .   .   .   .   .   .   .

Or, to sample from the Gaussian distribution, curand_normal() is used:

  std = mc->pstp[(Nbeta*k + Ibeta)*Nseq + Iseq]; /* pstp[k,ibeta,iseq] */
  randn = curand_normal(&mc->rndst[ist]);
  p = p + std*randn;

Can anybody tell me which of the cuRand generators (xorwow, lcs, mtgp ...) is used here, that is, which one is the default?

Upvotes: 2

Views: 3696

Answers (1)

Robert Crovella

Reputation: 151829

The curand documentation includes a section on device API examples. The second example there uses MTGP to generate random numbers in device code, and the same kernel then does a basic computation on those numbers (it counts how many have the lowest bit set). This seems to be what you're asking for: how to generate random numbers on the device and use them in device code. Is something missing there?
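For reference, here is a minimal sketch along the lines of that documentation example, adapted to draw uniform and normal samples. The kernel name mtgp_sample, the block and thread counts, the seed, and the accumulation into usum/nsum are my own illustrative choices, not anything from your code. As far as I understand the MTGP32 device API, the state is set up from the host (instead of calling curand_init() in a kernel), there is one state per block rather than per thread, at most 256 threads may share a state, and the pre-generated parameter table supports up to 200 states.

```cuda
// Sketch only: MTGP32 (Mersenne Twister for GPUs) through the cuRAND device API.
// Assumed build line: nvcc mtgp_sketch.cu -lcurand
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>
#include <curand_kernel.h>
#include <curand_mtgp32_host.h>        // curandMakeMTGP32Constants, curandMakeMTGP32KernelState
#include <curand_mtgp32dc_p_11213.h>   // pre-generated table mtgp32dc_params_fast_11213

#define NBLOCKS  64    // one MTGP32 state per block (illustrative value)
#define NTHREADS 256   // at most 256 threads may share one MTGP32 state

// Hypothetical kernel: each thread draws n uniform and n normal samples
// and stores their sums so the host can check the means.
__global__ void mtgp_sample(curandStateMtgp32 *state, int n,
                            float *usum, float *nsum)
{
    int id = blockIdx.x * blockDim.x + threadIdx.x;
    float su = 0.0f, sn = 0.0f;
    // The state is indexed by block, not by thread, and every thread of the
    // block makes the same number of calls (no divergent retry loops here).
    for (int i = 0; i < n; i++) {
        su += curand_uniform(&state[blockIdx.x]);  // uniform in (0, 1]
        sn += curand_normal(&state[blockIdx.x]);   // standard normal
    }
    usum[id] = su;
    nsum[id] = sn;
}

int main()
{
    const int n = 1000;
    const int total = NBLOCKS * NTHREADS;
    curandStateMtgp32 *dStates = nullptr;
    mtgp32_kernel_params *dParams = nullptr;
    float *dU = nullptr, *dN = nullptr;

    cudaMalloc((void **)&dStates, NBLOCKS * sizeof(curandStateMtgp32));
    cudaMalloc((void **)&dParams, sizeof(mtgp32_kernel_params));
    cudaMalloc((void **)&dU, total * sizeof(float));
    cudaMalloc((void **)&dN, total * sizeof(float));

    // Host-side setup of the generator constants and the per-block states.
    if (curandMakeMTGP32Constants(mtgp32dc_params_fast_11213, dParams)
            != CURAND_STATUS_SUCCESS ||
        curandMakeMTGP32KernelState(dStates, mtgp32dc_params_fast_11213,
                                    dParams, NBLOCKS, 1234ULL)
            != CURAND_STATUS_SUCCESS) {
        fprintf(stderr, "MTGP32 setup failed\n");
        return 1;
    }

    mtgp_sample<<<NBLOCKS, NTHREADS>>>(dStates, n, dU, dN);
    if (cudaDeviceSynchronize() != cudaSuccess) {
        fprintf(stderr, "kernel failed\n");
        return 1;
    }

    std::vector<float> hU(total), hN(total);
    cudaMemcpy(hU.data(), dU, total * sizeof(float), cudaMemcpyDeviceToHost);
    cudaMemcpy(hN.data(), dN, total * sizeof(float), cudaMemcpyDeviceToHost);

    double mu = 0.0, mn = 0.0;
    for (int i = 0; i < total; i++) { mu += hU[i]; mn += hN[i]; }
    // The means should land near 0.5 and 0.0 respectively.
    printf("mean uniform = %f, mean normal = %f\n",
           mu / ((double)total * n), mn / ((double)total * n));

    cudaFree(dStates); cudaFree(dParams); cudaFree(dU); cudaFree(dN);
    return 0;
}
```

One thing to watch if you port your MCMC kernels to this: the MTGP32 device functions advance the shared state for the whole block and, as far as I can tell, synchronize the block internally, so a per-thread retry loop like your do { ... } while (k >= mc->nprm) would have to be restructured so that all threads in the block call the generator together.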

The documentation also states that the default generator used by curand is XORWOW:

The default pseudorandom generator, XORWOW,...

and here also.
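If you want to confirm this for the device API on your own installation, a compile-time check is enough. This is a small sketch that assumes mc->rndst is declared as an array of curandState (or curandState_t); your question does not show the declaration, so that part is a guess. In curand_kernel.h both of those names are typedefs of the XORWOW state struct, which is why curand_init(), curand_uniform() and curand_normal() resolve to the XORWOW versions in your kernels.

```cuda
// Sketch: verify at compile time that the plain curandState type is XORWOW.
// Assumed build line: nvcc check_default.cu
#include <cstdio>
#include <type_traits>
#include <curand_kernel.h>

// Both checks fail to compile if the default state type is not XORWOW.
static_assert(std::is_same<curandState, curandStateXORWOW>::value,
              "curandState is not the XORWOW state type");
static_assert(std::is_same<curandState_t, curandStateXORWOW_t>::value,
              "curandState_t is not the XORWOW state type");

int main()
{
    printf("curandState is curandStateXORWOW\n");
    return 0;
}
```

With the device API the generator is selected by the state type you declare, so switching generators means changing the type of the state array (for example to curandStateMRG32k3a or curandStatePhilox4_32_10_t) rather than passing a flag.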

Upvotes: 3
