Psypher

Reputation: 436

CUDA: PRNG generator and state

I have been playing around with the cuRAND library to understand the concepts of a generator and a state. The term "state" still puzzles me, so I would like to clarify my understanding with the help of some code.

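#include <cuda_runtime.h>
#include <curand_kernel.h>

// Kernels are defined below; declared here so that main() compiles.
__global__ void initCurand(curandState *state, unsigned long seed);
__global__ void testrand(curandState *state, float *d1, int rows);

int main () {

    float *hData, *dData;
    curandState *devState;

    size_t nThreads = 4;
    size_t nRows    = 10;
    size_t n        = nRows * nThreads;

    hData = new float[n];
    cudaMalloc((void**)&dData, n * sizeof(float));
    cudaMemset(dData, 0, n * sizeof(float));

    // One state per thread.
    cudaMalloc((void**)&devState, nThreads * sizeof(curandState));
    initCurand <<< 1, nThreads >>> (devState, 1234);
    cudaDeviceSynchronize();

    testrand <<< 1, nThreads >>> (devState, dData, nRows);
    cudaDeviceSynchronize();
    // The buffer holds floats, so size the copy with sizeof(float).
    cudaMemcpy(hData, dData, n * sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(devState);
    cudaFree(dData);
    delete[] hData;  // allocated with new[], so release with delete[]
    return 0;
}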

Below are the kernels.

__global__ void initCurand(curandState *state, unsigned long seed)
{
    curand_init(seed, threadIdx.x, 0, &state[threadIdx.x]);
}

__global__ void testrand(curandState *state, float *d1, int rows)
{

    int idx    = threadIdx.x;
    int stride = blockDim.x;
    for (int i = 0; i < rows; i++)
    {
        d1[idx + i * stride] = curand_uniform(&state[idx]);
    }
}

Simply put, there are as many states as threads, and each thread draws 10 random numbers from its own state. Here is the output:

0.145468 0.820181 0.550399 0.294830 
0.434899 0.926417 0.811845 0.308556 
0.870710 0.511765 0.782640 0.620706 
0.455165 0.537594 0.742539 0.535606 
0.857093 0.809246 0.541354 0.497212 
0.582418 0.017524 0.195556 0.898062 
0.201404 0.449338 0.006050 0.041652 
0.786745 0.799349 0.093755 0.994597 
0.300772 0.136307 0.648018 0.970036 
0.366787 0.377424 0.096621 0.495483 

Can I therefore conclude that a state is actually a generator itself? Each generator/state produces uniformly distributed random numbers, and different states maintain a distance from one another so that the generated numbers do not repeat. ==> Q1

From the cuRAND guide, I learned that creating a lot of states is compute-intensive, and my own experiments confirmed this. How can I generate random numbers using a single state (variable) such that each thread consumes a different random number (the next random number in the queue) from the given (single) distribution? ==> Q2

For example, I have four threads and a single state variable that generates a (hypothetical) distribution of random numbers as follows:

2 6 100 26 81 72 78 21 33 57 19 32 ...

            1st Cycle   2nd Cycle

Thread_1:        2          81
Thread_2:        6          72
Thread_3:      100          78
Thread_4:       26          21

Is this possible using a single state variable?

Upvotes: 0

Views: 226

Answers (1)

Robert Crovella

Reputation: 151829

Q1: state is data. It represents all the data (such as seeds, sequence positions, subsequence positions, etc.) required to generate a random number.

A generator is a function. It operates on an instance of state, and creates a random number (according to the generator type), and also updates (modifies) the state.
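The data/function split can be made concrete with a minimal sketch. It assumes the default XORWOW `curandState` from the device API; the kernel `stateDemo` and the host helper `runStateDemo` are hypothetical names introduced only for illustration. The kernel takes a snapshot of the state by plain struct copy, draws twice from the original, and then draws once from the untouched snapshot.

```cuda
#include <cuda_runtime.h>
#include <curand_kernel.h>

// Draws twice from a state and once from a snapshot taken before the first draw.
__global__ void stateDemo(unsigned long long seed, unsigned int *out)
{
    curandState state;                // plain data: generator bookkeeping, no stored numbers
    curand_init(seed, 0, 0, &state);  // seed, subsequence 0, offset 0

    curandState snapshot = state;     // copying the struct copies the sequence position

    out[0] = curand(&state);          // generator reads the state, returns a number, advances the state
    out[1] = curand(&state);          // next number in the same sequence
    out[2] = curand(&snapshot);       // snapshot was never advanced, so it replays the first draw
}

// Hypothetical host helper: runs the demo and copies the three values back.
bool runStateDemo(unsigned int results[3])
{
    unsigned int *dOut = nullptr;
    if (cudaMalloc(&dOut, 3 * sizeof(unsigned int)) != cudaSuccess) return false;

    stateDemo<<<1, 1>>>(1234ULL, dOut);

    cudaError_t err = cudaMemcpy(results, dOut, 3 * sizeof(unsigned int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dOut);
    return err == cudaSuccess;
}
```

After `runStateDemo(r)`, `r[0]` and `r[2]` should be identical because both draws started from the same state contents, while `r[1]` is the next value in the sequence because the generator modified `state` during the first call.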

Q2: We generate multiple states, so that multiple threads can simultaneously generate random numbers. It does not make sense to talk about a single state servicing multiple threads, if the threads are generating random numbers in parallel. Since the state is updated by the generator (see Q1) during the RNG process, you would have some threads reading from the state and other threads writing to the state, which would be a race condition, and would likely lead to corruption of the state.

What you are describing, where threads 1-4 generate 2, 6, 100, and 26 from a single state on the first cycle, is not possible.
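If the goal is only the interleaved ordering of one sequence, rather than literally sharing one state variable, a possible sketch is to give every thread its own state on the same seed and subsequence, started at a different offset. This still uses one state per thread, so it does not contradict the point above, and it is not claimed to be cheaper than the usual one-subsequence-per-thread setup, since `skipahead` has its own cost. The kernel and helper names below (`interleavedDraws`, `sequentialDraws`, `interleavedMatchesSequential`) are hypothetical; the sketch assumes the default XORWOW generator, where `curand_uniform` consumes one element of the sequence per call.

```cuda
#include <vector>
#include <cuda_runtime.h>
#include <curand_kernel.h>

// Every thread owns a private state, but all states walk the SAME sequence
// (same seed, subsequence 0); only the starting offset differs.
__global__ void interleavedDraws(unsigned long long seed, float *out, int cycles)
{
    int tid      = threadIdx.x;
    int nThreads = blockDim.x;

    curandState state;
    curand_init(seed, 0, tid, &state);  // start at element tid of the sequence

    for (int c = 0; c < cycles; ++c) {
        out[c * nThreads + tid] = curand_uniform(&state);       // element c*nThreads + tid
        skipahead((unsigned long long)(nThreads - 1), &state);  // jump over the other threads' elements
    }
}

// Reference: a single thread drawing the same sequence in order.
__global__ void sequentialDraws(unsigned long long seed, float *out, int n)
{
    curandState state;
    curand_init(seed, 0, 0, &state);
    for (int i = 0; i < n; ++i) out[i] = curand_uniform(&state);
}

// Hypothetical host helper: true when both kernels fill identical buffers.
bool interleavedMatchesSequential(int nThreads, int cycles)
{
    const int n = nThreads * cycles;
    const size_t bytes = n * sizeof(float);
    float *dA = nullptr, *dB = nullptr;
    if (cudaMalloc(&dA, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&dB, bytes) != cudaSuccess) { cudaFree(dA); return false; }

    interleavedDraws<<<1, nThreads>>>(1234ULL, dA, cycles);
    sequentialDraws<<<1, 1>>>(1234ULL, dB, n);

    std::vector<float> a(n), b(n);
    bool ok = cudaMemcpy(a.data(), dA, bytes, cudaMemcpyDeviceToHost) == cudaSuccess
           && cudaMemcpy(b.data(), dB, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    cudaFree(dA);
    cudaFree(dB);
    return ok && a == b;
}
```

Calling `interleavedMatchesSequential(4, 3)` compares the four-thread interleaved result against a single-thread walk of the same sequence. No thread ever reads or writes another thread's state, so the race condition described above does not arise.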

Upvotes: 1
