Sayan

Reputation: 99

Is 32 bit comparison faster than 64 bit comparison?

Is the comparison of 32 bits faster than the comparison of 64 bits?

I was looking at this file http://www.netlib.org/fdlibm/s_cos.c

They have this piece of code

    /* |x| ~< pi/4 */
    ix &= 0x7fffffff;
    if(ix <= 0x3fe921fb) return __kernel_cos(x,z);

I understand the first line, which calculates the absolute value of x. But why is the comparison so complicated? Is there any improvement in performance by comparing the first 32 bits and not all 64 bits? Could I write

    long unsigned int ix = *(long unsigned int *)(&x);
    ix &= 0x7fffffffffffffff;
    if (ix < 0x3fe921fb54442d18)
        /* What comes next */ ;

and expect the same performance in terms of speed on a 64-bit machine? Though I agree this would consume more memory.

0x3fe921fb54442d18 is the bit pattern of pi/4.

Upvotes: 5

Views: 749

Answers (1)

CPlus

Reputation: 4848

On my 64-bit Intel machine with Apple clang version 12.0.0, I timed ix <= 0x3fe921fb with an int-typed ix, running the comparison over a billion times (1<<30 iterations) per run, for several runs. I then timed ix < 0x3fe921fb54442d18 with an unsigned long-typed ix for the same number of iterations. Here are the results in seconds:

No optimization 32-bit:

1.470922
1.448247
1.449718
1.446084
1.450020
1.453608

No optimization 64-bit:

1.567637
1.561653
1.565024
1.575094
1.567794
1.564141

-O3 32-bit:

0.421903
0.419469
0.425281
0.419894
0.425790
0.424800

-O3 64-bit:

0.636965
0.640522
0.637279
0.634344
0.634989
0.633755

The 32-bit comparisons, at least on my machine, are consistently slightly faster.

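I would go for the first option. In these runs it is faster even on a 64-bit machine, regardless of optimization setting, and it also uses less memory and power, as the 32-bit comparison circuits are smaller and use less energy. Please also note that long unsigned int ix = *(long unsigned int *)(&x); with a double-typed x is technically undefined behavior: it reads a double through an lvalue of an unrelated integer type, which violates the strict aliasing rule. Copying the bytes with memcpy is the well-defined way to get the same bits.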
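As a rough sketch of a well-defined alternative to the pointer cast, the bits of the double can be copied out with memcpy and then compared either way. The helper names double_bits, below_pio4_high and below_pio4_full are made up for this illustration (they are not from fdlibm), and the code assumes IEEE 754 binary64 doubles:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the bits of a double into a 64-bit integer.
   Assumes double is IEEE 754 binary64 (8 bytes). */
static uint64_t double_bits(double x) {
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);   /* well-defined, unlike the pointer cast */
    return bits;
}

/* fdlibm-style test: look only at the high 32 bits of |x|. */
static int below_pio4_high(double x) {
    uint32_t hx = (uint32_t)(double_bits(x) >> 32) & 0x7fffffffu; /* drop sign bit */
    return hx <= 0x3fe921fbu;
}

/* Full-width test against the complete bit pattern of pi/4. */
static int below_pio4_full(double x) {
    uint64_t ix = double_bits(x) & UINT64_C(0x7fffffffffffffff);  /* drop sign bit */
    return ix <= UINT64_C(0x3fe921fb54442d18);
}
```

For example, both functions return 1 for 0.5 and -0.5, and both return 0 for 1.0. The two tests can only disagree for values whose high word is exactly 0x3fe921fb, where the high-word version accepts inputs very slightly above pi/4; that matches the "~<" in the fdlibm comment.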

Test Code:

    #include <stdio.h>
    #include <time.h>

    int main(void) {
        {   /* 32-bit comparison */
            volatile int ix = 0; // Changing this value has no effect
            clock_t before = clock();
            for (int i = 1<<30; i--;) {
                volatile int sink = ix <= 0x3fe921fb;
            }
            printf("%f\n", (double)(clock()-before)/CLOCKS_PER_SEC);
        }
        {   /* 64-bit comparison, in its own scope so the names can be reused */
            volatile unsigned long ix = 0;
            clock_t before = clock();
            for (int i = 1<<30; i--;) {
                volatile int sink = ix < 0x3fe921fb54442d18;
            }
            printf("%f\n", (double)(clock()-before)/CLOCKS_PER_SEC);
        }
        return 0;
    }

Upvotes: 0
