user1999728

Reputation: 903

CUDA out of resources when trying to launch through MATLAB

After fixing the code I posted here (adding *sizeof(float) to the shared memory allocation, although that doesn't matter in this case because I allocate shared memory through MATLAB), I ran the code, which successfully returned results of up to sizeof(float)*18*18*5000*100 bytes.

I took the PTX and used it to run the code through MATLAB (it found the right entry point, i.e. the function I wanted to run):

    kernel=parallel.gpu.CUDAKernel('Tst.ptx','float *,const float *,int');
    mask=gpuArray.randn([7,7,1],'single');
    toConv=gpuArray.randn([12,12,5],'single'); % generate random data for testing
    setConstantMemory(kernel,'masks',mask);    % transfer data to constant memory
    kernel.ThreadBlockSize=[(12+2*7)-2 (12+2*7)-2 1]; % 24x24x1 = 576 threads per block
    kernel.GridSize=[1 5 1]; % first element: number of convolution masks;
                             % second element: number of matrices to convolve
    kernel.SharedMemorySize=(24*24*4); % bytes: 24*24 single-precision floats
    foo=gpuArray.zeros([18 18 5 1],'single'); % result size
    foo=reshape(foo,[numel(foo) 1]);
    toConv=reshape(toConv,[numel(toConv) 1]);
    foo=feval(kernel,foo,toConv,12);

I get:

    Error using parallel.gpu.CUDAKernel/feval
    An unexpected error occurred trying to launch a kernel. The CUDA error was:
    CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES

    Error in tst (line 12)
    foo=feval(kernel,foo,toConv,12);

Out of resources for such a small example? It worked for a problem a hundred thousand times larger in Visual Studio...

I have a GTX 480 (compute capability 2.0, about 1.5 GB of memory, 1024 max threads per block, 48K shared memory).

ptxas output:

    1>  ptxas : info : 0 bytes gmem, 25088 bytes cmem[2]
    1>  ptxas : info : Compiling entry function '_Z6myConvPfPKfi' for 'sm_21'
    1>  ptxas : info : Function properties for _Z6myConvPfPKfi
    1>      0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
    1>  ptxas : info : Used 10 registers, 44 bytes cmem[0]
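As a rough sanity check, the launch configuration can be compared against the limits quoted in the post. This is only a sketch: the variable names are made up for illustration, the register-file size (32768 registers per multiprocessor on compute 2.x) and the 64 KB constant-memory limit are assumptions not stated in the post, register allocation granularity is ignored, and the ptxas figures may come from a different build configuration than the PTX that MATLAB actually loaded.

```matlab
% Figures taken from the post.
blockSize      = [(12+2*7)-2, (12+2*7)-2, 1];  % kernel.ThreadBlockSize
sharedBytes    = 24*24*4;                      % kernel.SharedMemorySize
regsPerThread  = 10;                           % "Used 10 registers" (ptxas)
constBytes     = 25088;                        % cmem[2] (ptxas)
maxThreads     = 1024;                         % max threads per block (post)
maxSharedBytes = 48*1024;                      % 48K shared memory (post)

% Assumptions, not stated in the post.
regsPerSM      = 32768;                        % registers per multiprocessor
maxConstBytes  = 64*1024;                      % total constant memory

threadsPerBlock = prod(blockSize);                 % 24*24*1
regsPerBlock    = threadsPerBlock*regsPerThread;   % ignores allocation granularity

fprintf('threads per block: %d of %d\n', threadsPerBlock, maxThreads);
fprintf('shared memory:     %d of %d bytes\n', sharedBytes, maxSharedBytes);
fprintf('registers/block:   %d of %d\n', regsPerBlock, regsPerSM);
fprintf('constant memory:   %d of %d bytes\n', constBytes, maxConstBytes);

% True when every requested resource is within its quoted or assumed limit.
fitsOnPaper = threadsPerBlock <= maxThreads && ...
              sharedBytes     <= maxSharedBytes && ...
              regsPerBlock    <= regsPerSM && ...
              constBytes      <= maxConstBytes;
```

On these numbers the launch fits comfortably within each limit, so the block size and shared memory request alone do not appear to explain the error.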

EDIT: Problem resolved by compiling with Configuration Active(Release) and Platform Active(x64).

Upvotes: 1

Views: 911

Answers (1)

user1999728

Reputation: 903

Problem resolved by compiling with Configuration Active(Release) and Platform Active(x64) instead of the defaults. Because of backwards compatibility, I'm guessing the fix has less to do with x64 than with compiling for release rather than debug.

Upvotes: 1
