Reputation: 20620
I am new to CUDA, and bank conflicts seem to be one of the restrictions on CUDA devices that must be carefully thought through. But while reading the compute capability 3.0 section of the CUDA Programming Guide, I found:
"A shared memory request for a warp does not generate a bank conflict between two threads that access any sub-word within the same 64-bit word (even though the addresses of the two sub-words fall in the same bank): In that case, for read accesses, the 64-bit word is broadcast to the requesting threads and for write accesses, each sub-word is written by only one of the threads (which thread performs the write is undefined)."
Does this mean that we can ignore bank conflicts for a CUDA application in CC 3.0 or higher?
Upvotes: 1
Views: 133
Reputation: 20620
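I think I found the answer: no. CC 3.0 is not free of bank conflicts, because threads in a warp that access different words falling in the same bank can still be serialized. What the quoted passage says is that multiple threads can access sub-words of the same 64-bit word without a bank conflict. I believe this will greatly reduce the programming effort spent on avoiding bank conflicts, especially in my projects.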
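As a rough illustration of that distinction, here is a host-side C++ sketch that models the bank mapping instead of running on a GPU. It assumes 32 banks that are each 8 bytes wide (the 64-bit bank mode of CC 3.x) and a warp of 32 threads where lane `i` accesses element `i * stride_elems`. The function name `conflict_degree` and the whole model are assumptions made for this example, not something taken from the programming guide.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>

// Simplified model (an assumption, not NVIDIA's specification):
// 32 banks, each 8 bytes wide, and a warp of 32 threads.
constexpr int kBanks = 32;
constexpr int kBankWidthBytes = 8;
constexpr int kWarpSize = 32;

// Returns the largest number of distinct 64-bit words that the warp touches
// in any single bank. 1 means conflict-free; n means an n-way conflict.
// Threads that hit the same 64-bit word are counted once, which models the
// broadcast behaviour described in the quoted passage.
int conflict_degree(std::size_t elem_size, std::size_t stride_elems) {
    std::map<int, std::set<std::uint64_t>> words_per_bank;
    for (int lane = 0; lane < kWarpSize; ++lane) {
        std::uint64_t byte_addr =
            static_cast<std::uint64_t>(lane) * stride_elems * elem_size;
        std::uint64_t word = byte_addr / kBankWidthBytes;  // 64-bit word index
        int bank = static_cast<int>(word % kBanks);        // bank holding that word
        words_per_bank[bank].insert(word);
    }
    int worst = 0;
    for (const auto& entry : words_per_bank) {
        worst = std::max(worst, static_cast<int>(entry.second.size()));
    }
    return worst;
}
```

Under this model, `conflict_degree(1, 1)` (consecutive `char` elements) and `conflict_degree(4, 1)` (consecutive `float` elements) both return 1, because threads that share a bank also share the same 64-bit word. `conflict_degree(8, 32)` (`double` elements with a stride of 32) returns 32, since every lane lands on a different word in bank 0. So the sub-word rule removes one class of conflicts, but strided access patterns still need care.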
Upvotes: 1