scatman

Reputation: 14565

Allocating global memory

I have the following code which allocates global memory on the GPU.

__global__ void mallocTest()
{
    char* ptr = (char*)malloc(123);
    //....
    free(ptr);
}

Will every thread allocate memory for a separate ptr? That is, if I have 2 blocks of 10 threads, are 20 arrays allocated (i.e. does every thread allocate memory for its own use)? How can I allocate the memory per block instead of per thread, so that with 2 blocks of 10 threads only 2 arrays are allocated? Is this possible?

Upvotes: 2

Views: 643

Answers (1)

talonmies

Reputation: 72372

If you execute that code on a device of compute capability 2.0 or higher (in-kernel malloc is not supported on earlier hardware), every thread will perform its own allocation from the device runtime heap, which lives in global memory. So if your execution grid has 20 threads, you will get 20 allocations: one per thread.
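A minimal sketch to check the one-allocation-per-thread behaviour, assuming a device that supports in-kernel malloc and the default device heap size. The counter argument okCount and the host helper countAllocations are hypothetical names added for illustration; they are not part of the question's code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Same kernel as in the question, plus a counter (okCount is an
// illustrative addition) that records each successful allocation.
__global__ void mallocTest(int* okCount)
{
    // Every thread executes this line, so every thread allocates.
    char* ptr = (char*)malloc(123);
    if (ptr != NULL) {
        atomicAdd(okCount, 1);  // count this thread's allocation
        free(ptr);              // release it back to the device heap
    }
}

// Hypothetical host helper: launches the kernel and returns how many
// allocations succeeded across the whole grid.
int countAllocations(int blocks, int threads)
{
    int* dCount = NULL;
    int hCount = 0;
    cudaMalloc(&dCount, sizeof(int));
    cudaMemset(dCount, 0, sizeof(int));

    mallocTest<<<blocks, threads>>>(dCount);

    // cudaMemcpy waits for the kernel to finish before copying.
    cudaMemcpy(&hCount, dCount, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dCount);
    return hCount;
}

int main()
{
    // 2 blocks of 10 threads, as in the question.
    printf("allocations: %d\n", countAllocations(2, 10));
    return 0;
}
```

With 2 blocks of 10 threads the count should come out as 20, as long as the device heap is not exhausted.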

If you want one array per block (with every thread in the block accessing the same array), then the logical approach is to use shared memory, especially if the memory is only needed for the lifetime of the block and isn't intended to be used again afterwards.
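A minimal sketch of the shared memory approach, assuming the array size is known at compile time and fits in the shared memory available per block. The names sharedTest, kBufBytes and runSharedTest are hypothetical, and the per-block sum is only there to show that all threads of a block see the same array.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative size, matching the 123 bytes requested in the question.
constexpr int kBufBytes = 123;

__global__ void sharedTest(int* out)
{
    // One array per block, shared by every thread in that block.
    __shared__ char buf[kBufBytes];

    // Each thread fills its own slot (guarded against oversized blocks).
    if (threadIdx.x < kBufBytes)
        buf[threadIdx.x] = (char)threadIdx.x;
    __syncthreads();  // make all writes visible to the whole block

    // Thread 0 reads the values written by the other threads.
    if (threadIdx.x == 0) {
        int sum = 0;
        for (int i = 0; i < blockDim.x && i < kBufBytes; ++i)
            sum += buf[i];
        out[blockIdx.x] = sum;
    }
    // No free() needed: the array disappears when the block finishes.
}

// Hypothetical host helper: runs the kernel and copies one result
// per block into hostOut (which must hold at least `blocks` ints).
void runSharedTest(int blocks, int threads, int* hostOut)
{
    int* dOut = NULL;
    cudaMalloc(&dOut, blocks * sizeof(int));
    sharedTest<<<blocks, threads>>>(dOut);
    cudaMemcpy(hostOut, dOut, blocks * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dOut);
}

int main()
{
    int sums[2];
    runSharedTest(2, 10, sums);  // 2 blocks of 10 threads
    for (int b = 0; b < 2; ++b)
        printf("block %d: sum = %d\n", b, sums[b]);
    return 0;
}
```

If the array is too large for shared memory, or has to outlive the block, a possible alternative is to let a single thread per block call malloc, publish the pointer through a __shared__ variable, and synchronize with __syncthreads() before the other threads use it.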

Upvotes: 3
