user8675309

Reputation: 11

Which memory space does cudaMalloc allocate memory in?

If I understand correctly, CUDA devices have a few different memory spaces (e.g. register, local, shared, global). When calling cudaMalloc(), in which memory space does the allocated memory reside?

For example:

__global__ void mykernel(void *p) {
    /* What memory space does p point to? */
    printf("p: %p\n", p);
}

int main() {
    void *p;
    assert(cudaMalloc(&p, 1024) == cudaSuccess);
    mykernel<<<1,1024>>> (p);
}

The documentation does not mention at what level the memory is allocated. It only says

Allocates size bytes of linear memory on the device and returns a pointer to the allocated memory. The allocated memory is suitably aligned for any kind of variable. The memory is not cleared.

It seems the memory would have to reside in one of global/constant/texture spaces, but which one?

Is it also safe to assume the memory will never be in local/register/shared memory space?

Upvotes: 1

Views: 951

Answers (1)

Oblivion

Reputation: 7374

global

cudaMalloc allocates in global memory. The other way to allocate global memory is with new and delete inside a kernel.

__global__ void myKernel(int N)
{
     int* a = new int[N]; // not recommended
     delete [] a;
}
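As a sanity check, you can ask the runtime itself what kind of memory a cudaMalloc pointer refers to. A minimal sketch using cudaPointerGetAttributes (the attr.type field assumes CUDA 10 or later; older toolkits named it memoryType):

```cuda
#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    void *p;
    assert(cudaMalloc(&p, 1024) == cudaSuccess);

    // Ask the runtime which memory space p belongs to.
    cudaPointerAttributes attr;
    assert(cudaPointerGetAttributes(&attr, p) == cudaSuccess);

    // cudaMemoryTypeDevice means ordinary device (global) memory.
    printf("device (global) memory: %s\n",
           attr.type == cudaMemoryTypeDevice ? "yes" : "no");

    cudaFree(p);
    return 0;
}
```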

shared

For dynamic shared memory you use something like

extern __shared__ int s[];

And launch the kernel like

myKernel<<<1,n,n*sizeof(int)>>>();

Or just __shared__ int s[4]; (inside the kernel) for static shared memory.
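Putting the two fragments together, here is a complete sketch (a hypothetical array-reversal kernel) showing how the third launch parameter sizes the dynamic shared array:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Reverses n ints in place, staging them through dynamic shared memory.
__global__ void reverse(int *d, int n)
{
    extern __shared__ int s[];   // sized by the 3rd launch parameter
    int t = threadIdx.x;
    s[t] = d[t];
    __syncthreads();             // all loads finish before anyone reads
    d[t] = s[n - 1 - t];
}

int main()
{
    const int n = 64;
    int h[n];
    for (int i = 0; i < n; ++i) h[i] = i;

    int *d;
    cudaMalloc(&d, n * sizeof(int));
    cudaMemcpy(d, h, n * sizeof(int), cudaMemcpyHostToDevice);

    // Third launch parameter = bytes of dynamic shared memory per block.
    reverse<<<1, n, n * sizeof(int)>>>(d, n);

    cudaMemcpy(h, d, n * sizeof(int), cudaMemcpyDeviceToHost);
    printf("h[0] = %d\n", h[0]);   // 63 after the reversal
    cudaFree(d);
    return 0;
}
```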


register

And for registers you can think of automatic allocation in C++ (just from a syntax point of view):

int example = 0;
int moreExample[4];

The main difference is that if you run out of registers you get register spilling, and the variable may end up in local memory (which is physically backed by device memory) instead of a register.
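To see how many registers a kernel actually uses, and whether it spills, you can ask ptxas for verbose output at compile time (mykernel.cu is a placeholder filename):

```shell
# Print per-kernel register counts and spill stores/loads.
nvcc -Xptxas -v mykernel.cu -o mykernel
# ptxas prints lines like:
#   Used N registers, 0 bytes spill stores, 0 bytes spill loads
```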

Upvotes: 3
