Yik

Reputation: 525

Does replacing int with short help performance in CUDA?

Assume that we have enough global memory. Does replacing int with short improve performance in CUDA (for example, by reducing the usage of shared memory, registers, etc.)?

Any advice is welcome. Thanks.

Upvotes: 3

Views: 1970

Answers (3)

fabmilo

Reputation: 48330

It depends:

If your program is memory bound, then yes: transferring the input as shorts could be beneficial, because half as many bytes have to be moved.

If your kernel is computation bound, the answer is more likely no, because the kernel has to do an extra operation to convert from short to int and then back to short each time.
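
To make the trade-off concrete, here is a small host-side C++ sketch (CUDA C++ follows the same integer promotion rules, so the same reasoning should carry over to kernel code). It shows that a short array needs fewer bytes than an int array of the same length, and that adding two shorts is carried out in int, so storing the result back into a short needs a narrowing conversion. The names `kCount` and `add_as_short` are made up for illustration, and the sketch assumes the usual 2-byte short and 4-byte int.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical element count, chosen only for illustration.
constexpr std::size_t kCount = 1024;

// Bytes that would have to be moved for each element type
// (assumes 2-byte short and 4-byte int, as on CUDA platforms).
constexpr std::size_t kBytesAsInt   = kCount * sizeof(int);
constexpr std::size_t kBytesAsShort = kCount * sizeof(short);

// Integer promotion: short + short is computed as int.
static_assert(std::is_same<decltype(short{} + short{}), int>::value,
              "short operands are promoted to int before the addition");

// Storing the int result back into a short requires a narrowing step.
inline short add_as_short(short a, short b) {
    int wide = a + b;                 // the arithmetic happens in int
    return static_cast<short>(wide);  // convert back to short
}
```

Whether the extra conversions cost anything measurable depends on the kernel and the architecture, so it is worth profiling both variants rather than relying on this sketch.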

Upvotes: 4

ArchaeaSoftware

Reputation: 4422

Tesla-class hardware (SM 1.x) has surprisingly rich support for "half registers," so you might get some mileage from using short instead of int on those platforms. You can confirm by using cuobjdump to look at the microcode in the cubin. But Fermi removed that support.

With SM 2.1, NVIDIA added support for "video" instructions that implement 32-bit-wide SIMD operations on 32-bit registers - see section 8.7.9 of the PTX 2.1 spec.

http://developer.download.nvidia.com/compute/cuda/3_1/toolkit/docs/ptx_isa_2.1.pdf

Upvotes: 1

aland

Reputation: 5209

Using short in shared memory will most likely reduce performance due to bank conflicts, unless you use short2.
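
A rough host-side model of the bank mapping may help here. It assumes an SM 1.x-style layout of 16 banks, each one 32-bit word wide, serviced per half-warp of 16 threads; those figures and the function name `max_threads_per_bank` are assumptions for illustration, and this is not a hardware simulator. It counts how many threads of a half-warp land in the same bank when thread i accesses element i of a shared array.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// Assumed SM 1.x-style layout: 16 banks, each 4 bytes wide,
// and 16 threads (one half-warp) issuing a request together.
constexpr std::size_t kBanks = 16;
constexpr std::size_t kBankWidthBytes = 4;
constexpr std::size_t kThreads = 16;

// Thread i accesses element i of an array whose elements are
// elem_bytes wide. Returns the largest number of threads that
// map to a single bank.
inline std::size_t max_threads_per_bank(std::size_t elem_bytes) {
    std::array<std::size_t, kBanks> hits{};  // zero-initialized counters
    for (std::size_t tid = 0; tid < kThreads; ++tid) {
        std::size_t byte_offset = tid * elem_bytes;
        std::size_t bank = (byte_offset / kBankWidthBytes) % kBanks;
        ++hits[bank];
    }
    return *std::max_element(hits.begin(), hits.end());
}
```

Under this model, 2-byte elements (short) put two threads in each bank they touch, while 4-byte elements (the size of a short2) give each thread its own bank. Whether two threads sharing a bank actually serialize depends on the architecture; the model only counts the overlap.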

Also, as far as I know, all registers on the GPU are 32-bit, so it's unlikely that using short would reduce register usage.

Upvotes: 4
