Reputation: 3
I'm writing CUDA C code to process images. For example, I wrote a swap function (it swaps blocks of the matrix), but it does not work every time. I think I have a problem with the number of blocks and the number of threads when I launch my kernel.
For example, if I take an image of size 2048*2048 with
threadsPerBlock.x=threadsPerBlock.y=64
and numBlocks.x=numBlocks.y=2048/threadsPerBlock.x
then swap<<<threadsPerBlock,numBlocks>>>(...)
works fine.
But if I take an image of size 2560*2160 with
threadsPerBlock.x=threadsPerBlock.y=64
and numBlocks.x=2560/64
and numBlocks.y=2160/64+1
then I get error 9, which is "invalid configuration argument".
I'm using CUDA 7.5 and a GPU with compute capability 5.0
Upvotes: 0
Views: 1065
Reputation: 72349
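The maximum number of threads per block for your compute capability 5.0 device is 1024. The source of your problem is that you have the arguments in the kernel launch reversed: the first launch argument is the grid dimensions (number of blocks) and the second is the block dimensions (threads per block). With the arguments swapped, your 2048*2048 case actually launches blocks of 32*32 = 1024 threads, which is exactly at the limit, so it works. Your 2560*2160 case launches blocks of 40*34 = 1360 threads, which is illegal. Note that the 64*64 block you intended would be 4096 threads, which also exceeds the limit.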
If you do something like this:
threadsPerBlock.x=threadsPerBlock.y=32
numBlocks.x=numBlocks.y=2048/threadsPerBlock.x
swap<<<numBlocks,threadsPerBlock>>>(...)
You should find the kernel launch works unconditionally.
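For image sizes that are not a multiple of the block size, such as 2560*2160, the grid has to be rounded up and the kernel needs a bounds check. Here is a minimal sketch of that, not your actual code: `swapHalves`, `ceilDiv` and `launchSwap` are hypothetical names, and the kernel (swapping the left and right halves of each row of a `float` image) is only a stand-in for whatever your swap does.

```cuda
#include <cuda_runtime.h>

// Round-up integer division, so a partial tile at the image edge still gets a block.
__host__ __device__ inline int ceilDiv(int n, int d) { return (n + d - 1) / d; }

// Hypothetical stand-in kernel: swaps the left and right halves of each row.
__global__ void swapHalves(float *img, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    int half = width / 2;

    // Threads in the rounded-up part of the grid fall outside the image,
    // and only the left half of each row performs a swap.
    if (x >= half || y >= height) return;

    float tmp = img[y * width + x];
    img[y * width + x] = img[y * width + x + half];
    img[y * width + x + half] = tmp;
}

// Launches the kernel with the grid first and the block second.
cudaError_t launchSwap(float *d_img, int width, int height)
{
    dim3 threadsPerBlock(32, 32);  // 32*32 = 1024 threads, the limit on compute capability 5.0
    dim3 numBlocks(ceilDiv(width, threadsPerBlock.x),
                   ceilDiv(height, threadsPerBlock.y));

    swapHalves<<<numBlocks, threadsPerBlock>>>(d_img, width, height);
    return cudaGetLastError();     // reports an invalid launch configuration, if any
}
```

For a 2560*2160 image this gives a grid of 80*68 blocks. Checking the value returned by `cudaGetLastError()` right after the launch is how you would catch the invalid configuration error at its source.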
Upvotes: 2