Reputation: 131
I am working on a CUDA project and can't figure out how to reduce an array whose size isn't a power of 2.
There are many questions about this on SO, but in my case the kernel has already been launched with a 2D block and 2D grid configuration, and the array is in shared memory. I don't think padding is an option, since the array has 280 to 300 elements and I would have to pad it up to 512 elements.
Is there any efficient algorithm for this?
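For reference, here is a minimal sketch of the kind of thing I have in mind: a shared-memory tree reduction that rounds the starting stride up to a power of two and guards the out-of-range reads, so no padding is needed. It flattens the 2D thread index into a linear one. The names `blockReduce`, `sumKernel`, `runSum` and `MAX_N` are made up for illustration, and I am assuming a sum reduction over `float` with `1 <= n <= MAX_N`.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

#define MAX_N 512  // assumed upper bound on the array size (hypothetical)

// Sum the first n elements of the shared array s; n need not be a power of 2.
// Must be called by every thread of the block, after s has been filled and
// synchronized. Assumes n >= 1.
__device__ float blockReduce(float *s, int n)
{
    int tid      = threadIdx.y * blockDim.x + threadIdx.x;  // flatten 2D index
    int nthreads = blockDim.x * blockDim.y;

    int p = 1;                      // smallest power of two >= n
    while (p < n) p <<= 1;

    for (int stride = p >> 1; stride > 0; stride >>= 1) {
        // Strided loop so blocks with fewer than `stride` threads still work.
        for (int i = tid; i < stride; i += nthreads) {
            if (i + stride < n)     // skip partners that fall outside the array
                s[i] += s[i + stride];
        }
        __syncthreads();            // reached by all threads: bounds are uniform
    }
    return s[0];
}

__global__ void sumKernel(const float *in, float *out, int n)
{
    __shared__ float s[MAX_N];
    int tid      = threadIdx.y * blockDim.x + threadIdx.x;
    int nthreads = blockDim.x * blockDim.y;

    for (int i = tid; i < n; i += nthreads) s[i] = in[i];   // load to shared
    __syncthreads();

    float total = blockReduce(s, n);
    if (tid == 0) out[blockIdx.y * gridDim.x + blockIdx.x] = total;
}

// Host helper: reduce n ones on the device and return the result.
float runSum(int n)
{
    std::vector<float> h(n, 1.0f);
    float *d_in = nullptr, *d_out = nullptr, result = 0.0f;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, sizeof(float));
    cudaMemcpy(d_in, h.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    dim3 block(16, 16), grid(1, 1);          // 2D launch, single block
    sumKernel<<<grid, block>>>(d_in, d_out, n);

    cudaMemcpy(&result, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return result;
}

int main()
{
    printf("sum = %f\n", runSum(300));
    return 0;
}
```

For 280 to 300 elements the first stride is 256, so only the first pass needs the bounds check to do any work; every later pass operates on a power-of-two prefix. I don't know whether this is the most efficient option, which is really what I am asking.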
Upvotes: 0
Views: 95