Reputation: 1778
When looking into the SASS output generated for the NVIDIA Fermi architecture, the instruction IADD.X is observed. From NVIDIA documentation, IADD means integer add, but not understanding what it means by IADD.X. Can somebody please help... Is this meaning an integer addition with extended number of bits?
The instruction snippet is:
IADD.X R5, R3, c[0x0][0x24]; /* 0x4800400090315c43 */
Upvotes: 2
Views: 527
Reputation: 26095
Yes, the .X
stands for eXtended precision. You will see IADD.X
used together with IADD.CC
, where the latter adds the less significant bits, and produces a carry flag (thus the .CC
), and this carry flag is then incorporated into addition of the more significant bits performed by IADD.X
.
Since NVIDIA GPUs are basically 32-bit processors with 64-bit addressing capability, a frequent use of this idiom is in address (pointer) arithmetic. The use of 64-bit integer types, such as long long int
or uint64_t
will likewise lead to the use of these instructions.
Here is a worked example of a kernel doing 64-bit integer addition. This CUDA code was compiled for compute capability 3.5 with CUDA 7.5, and the machine code dumped with cuobjdump --dump-sass
.
__global__ void addint64 (long long int a, long long int b, long long int *res)
{
*res = a + b;
}
MOV R1, c[0x0][0x44];
MOV R2, c[0x0][0x148]; // b[31:0]
MOV R0, c[0x0][0x14c]; // b[63:32]
IADD R4.CC, R2, c[0x0][0x140]; // tmp[31:0] = b[31:0] + a[31:0]; carry-out
MOV R2, c[0x0][0x150]; // res[31:0]
MOV R3, c[0x0][0x154]; // res[63:32]
IADD.X R5, R0, c[0x0][0x144]; // tmp[63:32] = b[63:32] + a[63:32] + carry-in
ST.E.64 [R2], R4; // [res] = tmp[63:0]
EXIT
Upvotes: 2