Hintron

Reputation: 331

The modulo operation doesn't seem to work on a 64-bit value of all ones

So... the modulo operation doesn't seem to work on a 64-bit value of all ones.

Here is my C code to set up the edge case:

#include <stdio.h>

int main(int argc, char *argv[]) {
    long long max_ll =   0xFFFFFFFFFFFFFFFF;
    long long large_ll = 0x0FFFFFFFFFFFFFFF;
    long long mask_ll =  0x00000F0000000000;

    printf("\n64-bit numbers:\n");
    printf("0x%016llX\n", max_ll % mask_ll);
    printf("0x%016llX\n", large_ll % mask_ll);

    long max_l =   0xFFFFFFFF;
    long large_l = 0x0FFFFFFF;
    long mask_l =  0x00000F00;

    printf("\n32-bit numbers:\n");
    printf("0x%08lX\n", max_l % mask_l);
    printf("0x%08lX\n", large_l % mask_l);

    return 0;
}

The output shows this:

64-bit numbers:
0xFFFFFFFFFFFFFFFF
0x000000FFFFFFFFFF

32-bit numbers:
0xFFFFFFFF
0x000000FF

What is going on here?

Why doesn't modulo work on a 64-bit value of all ones, but it will on a 32-bit value of all ones?

Is this a bug with the Intel CPU? Or with C somehow? Or is it something else?

More Info

I'm on a Windows 10 machine with an Intel i5-4570S CPU. I used the cl compiler from Visual Studio 2015.

I also verified this result using the Windows Calculator app (Version 10.1601.49020.0) in Programmer mode. If you take 0xFFFF FFFF FFFF FFFF modulo anything, it just returns the value itself.

Specifying unsigned vs signed didn't seem to make any difference.

Please enlighten me :) I actually did have a use case for this operation... so it's not purely academic.

Upvotes: 0

Views: 2282

Answers (3)

chqrlie

Reputation: 145277

Actually it does make a difference whether the values are defined as signed or unsigned:

#include <stdio.h>
#include <limits.h>

int main(void) {
#if ULLONG_MAX == 0xFFFFFFFFFFFFFFFF
    long long max_ll =   0xFFFFFFFFFFFFFFFF;  // converts to -1LL
    long long large_ll = 0x0FFFFFFFFFFFFFFF;
    long long mask_ll =  0x00000F0000000000;

    printf("\n" "signed 64-bit numbers:\n");
    printf("0x%016llX\n", max_ll % mask_ll);
    printf("0x%016llX\n", large_ll % mask_ll);

    unsigned long long max_ull =   0xFFFFFFFFFFFFFFFF;
    unsigned long long large_ull = 0x0FFFFFFFFFFFFFFF;
    unsigned long long mask_ull =  0x00000F0000000000;

    printf("\n" "unsigned 64-bit numbers:\n");
    printf("0x%016llX\n", max_ull % mask_ull);
    printf("0x%016llX\n", large_ull % mask_ull);
#endif

#if UINT_MAX == 0xFFFFFFFF
    int max_l =   0xFFFFFFFF;  // converts to -1;
    int large_l = 0x0FFFFFFF;
    int mask_l =  0x00000F00;

    printf("\n" "signed 32-bit numbers:\n");
    printf("0x%08X\n", max_l % mask_l);
    printf("0x%08X\n", large_l % mask_l);

    unsigned int max_ul =   0xFFFFFFFF;
    unsigned int large_ul = 0x0FFFFFFF;
    unsigned int mask_ul =  0x00000F00;

    printf("\n" "unsigned 32-bit numbers:\n");
    printf("0x%08X\n", max_ul % mask_ul);
    printf("0x%08X\n", large_ul % mask_ul);
#endif
    return 0;
}

Produces this output:

signed 64-bit numbers:
0xFFFFFFFFFFFFFFFF
0x000000FFFFFFFFFF

unsigned 64-bit numbers:
0x000000FFFFFFFFFF
0x000000FFFFFFFFFF

signed 32-bit numbers:
0xFFFFFFFF
0x000000FF

unsigned 32-bit numbers:
0x000000FF
0x000000FF

The 64-bit hex constant 0xFFFFFFFFFFFFFFFF has the value -1 when stored into a long long. This is actually implementation-defined, because it is an out-of-range conversion to a signed type, but on Intel processors with current compilers the conversion simply keeps the same bit pattern.

Note that you are not using the fixed-size integer types defined in <stdint.h>: int64_t, uint64_t, int32_t and uint32_t. The standard specifies long long as having at least 64 bits, and on Intel x86_64 it has exactly 64. long has at least 32 bits, but on that same processor its size differs between environments: 32 bits on Windows 10 (even in 64-bit mode) and 64 bits on macOS and 64-bit Linux. This is why the long case can behave surprisingly: signed and unsigned long give different results on Windows, but the same result on Linux and macOS, because there the computation is done in 64 bits and these values are simply positive numbers.

Also note that LLONG_MIN / -1 and LLONG_MIN % -1 both invoke undefined behavior because of signed arithmetic overflow. This one is not silently ignored on Intel PCs: it usually raises an uncaught exception and terminates the program, just like 1 / 0 and 1 % 0.

Upvotes: 2

M.M

Reputation: 141648

Your program causes undefined behaviour by using the wrong format specifier.

%llX may only be used with unsigned long long. If you use the right specifier, %lld, then the apparent mystery goes away:

#include <stdio.h>

int main(int argc, char* argv[])
{
    long long max_ll =   0xFFFFFFFFFFFFFFFF;
    long long mask_ll =  0x00000F0000000000;

    printf("%lld %% %lld = %lld\n", max_ll, mask_ll, max_ll % mask_ll);
}

Output:

-1 % 16492674416640 = -1

In ISO C the definition of the % operator is such that (a/b)*b + a%b == a. Also, for negative numbers, / follows "truncation towards zero".

So -1 / 16492674416640 is 0, therefore -1 % 16492674416640 must be -1 to make the above formula work.


As discussed in comments, the following line:

long long max_ll =   0xFFFFFFFFFFFFFFFF;

causes implementation-defined behaviour (assuming that your system has long long as a 64-bit type). The constant 0xFFFFFFFFFFFFFFFF has type unsigned long long, and it is out of range for long long whose maximum permitted value is 0x7FFFFFFFFFFFFFFF.

When an out-of-range assignment is made to a signed type, the behaviour is implementation-defined, which means the compiler documentation must say what happens.

Typically, this will be defined as generating the value which is in range of long long and has the same representation as the unsigned long long constant has. In 2's complement, (long long)-1 has the same representation as the unsigned long long value 0xFFFFFFFFFFFFFFFF, which explains why you ended up with max_ll holding the value -1.

Upvotes: 2

mihyar

Reputation: 29

Try putting unsigned before your long long. As a signed number, your 0xFF...FF is actually -1 on most platforms.

Also, in your code, your 32-bit numbers are still 64 bits wide (you have them declared as long long as well).

Upvotes: 1
