Reputation: 331
So... the modulo operation doesn't seem to work on a 64-bit value of all ones.
Here is my C code to set up the edge case:
#include <stdio.h>

int main(int argc, char *argv[]) {
    long long max_ll = 0xFFFFFFFFFFFFFFFF;
    long long large_ll = 0x0FFFFFFFFFFFFFFF;
    long long mask_ll = 0x00000F0000000000;
    printf("\n64-bit numbers:\n");
    printf("0x%016llX\n", max_ll % mask_ll);
    printf("0x%016llX\n", large_ll % mask_ll);

    long max_l = 0xFFFFFFFF;
    long large_l = 0x0FFFFFFF;
    long mask_l = 0x00000F00;
    printf("\n32-bit numbers:\n");
    printf("0x%08lX\n", max_l % mask_l);
    printf("0x%08lX\n", large_l % mask_l);
    return 0;
}
The output shows this:
64-bit numbers:
0xFFFFFFFFFFFFFFFF
0x000000FFFFFFFFFF
32-bit numbers:
0xFFFFFFFF
0x000000FF
What is going on here?
Why doesn't modulo work on a 64-bit value of all ones, but it will on a 32-bit value of all ones?
Is this a bug with the Intel CPU? Or with C somehow? Or is it something else?
I'm on a Windows 10 machine with an Intel i5-4570S CPU. I used the cl compiler from Visual Studio 2015.
I also verified this result using the Windows Calculator app (Version 10.1601.49020.0) in Programmer mode. If you take the modulus of 0xFFFF FFFF FFFF FFFF with anything, it just returns the value itself.
Specifying unsigned vs signed didn't seem to make any difference.
Please enlighten me :) I actually did have a use case for this operation... so it's not purely academic.
Upvotes: 0
Views: 2282
Reputation: 145277
Actually, it does make a difference whether the values are defined as signed or unsigned:
#include <stdio.h>
#include <limits.h>

int main(void) {
#if ULLONG_MAX == 0xFFFFFFFFFFFFFFFF
    long long max_ll = 0xFFFFFFFFFFFFFFFF; // converts to -1LL
    long long large_ll = 0x0FFFFFFFFFFFFFFF;
    long long mask_ll = 0x00000F0000000000;
    printf("\n" "signed 64-bit numbers:\n");
    printf("0x%016llX\n", max_ll % mask_ll);
    printf("0x%016llX\n", large_ll % mask_ll);

    unsigned long long max_ull = 0xFFFFFFFFFFFFFFFF;
    unsigned long long large_ull = 0x0FFFFFFFFFFFFFFF;
    unsigned long long mask_ull = 0x00000F0000000000;
    printf("\n" "unsigned 64-bit numbers:\n");
    printf("0x%016llX\n", max_ull % mask_ull);
    printf("0x%016llX\n", large_ull % mask_ull);
#endif
#if UINT_MAX == 0xFFFFFFFF
    int max_l = 0xFFFFFFFF; // converts to -1;
    int large_l = 0x0FFFFFFF;
    int mask_l = 0x00000F00;
    printf("\n" "signed 32-bit numbers:\n");
    printf("0x%08X\n", max_l % mask_l);
    printf("0x%08X\n", large_l % mask_l);

    unsigned int max_ul = 0xFFFFFFFF;
    unsigned int large_ul = 0x0FFFFFFF;
    unsigned int mask_ul = 0x00000F00;
    printf("\n" "unsigned 32-bit numbers:\n");
    printf("0x%08X\n", max_ul % mask_ul);
    printf("0x%08X\n", large_ul % mask_ul);
#endif
    return 0;
}
Produces this output:
signed 64-bit numbers:
0xFFFFFFFFFFFFFFFF
0x000000FFFFFFFFFF
unsigned 64-bit numbers:
0x000000FFFFFFFFFF
0x000000FFFFFFFFFF
signed 32-bit numbers:
0xFFFFFFFF
0x000000FF
unsigned 32-bit numbers:
0x000000FF
0x000000FF
The 64-bit hex constant 0xFFFFFFFFFFFFFFFF has the value -1 when stored into a long long. This is actually implementation-defined, because it is an out-of-range conversion to a signed type, but on Intel processors with current compilers the conversion simply keeps the same bit pattern.
Note that you are not using the fixed-size integer types defined in <stdint.h>: int64_t, uint64_t, int32_t and uint32_t. The long long types are specified in the standard as having at least 64 bits, and on Intel x86_64 they have exactly 64. long has at least 32 bits, but on the same processor its size differs between environments: 32 bits on Windows 10 (even in 64-bit mode) and 64 bits on macOS and 64-bit Linux. This is why you can observe surprising behavior in the long case, where unsigned and signed may produce the same result. They don't on Windows, but they do on Linux and macOS, because the computation is done in 64 bits and these values are just positive numbers.
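A minimal sketch of the fixed-width approach, assuming a C99 toolchain that provides <stdint.h> and <inttypes.h>. The helper names format_mod_u64 and format_mod_u32 are hypothetical, and both assume a nonzero divisor. The operands are unsigned and have the same width on every platform, and the PRIX64 / PRIX32 macros supply the matching printf conversion:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: writes a % b as 16 zero-padded hex digits into buf.
   Assumes b != 0. Returns whatever snprintf returns. */
int format_mod_u64(char *buf, size_t size, uint64_t a, uint64_t b)
{
    return snprintf(buf, size, "0x%016" PRIX64, a % b);
}

/* Same idea for 32-bit operands, written as 8 zero-padded hex digits.
   The cast keeps the argument type in step with PRIX32. */
int format_mod_u32(char *buf, size_t size, uint32_t a, uint32_t b)
{
    return snprintf(buf, size, "0x%08" PRIX32, (uint32_t)(a % b));
}
```

Calling format_mod_u64(buf, sizeof buf, UINT64_MAX, UINT64_C(0x00000F0000000000)) should leave 0x000000FFFFFFFFFF in buf, the same value as the unsigned 64-bit output above.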
Also note that LLONG_MIN / -1 and LLONG_MIN % -1 both invoke undefined behavior because of signed arithmetic overflow. This one is not ignored on Intel PCs: it usually raises an uncaught exception and terminates the program, just like 1 / 0 and 1 % 0.
Upvotes: 2
Reputation: 141648
Your program causes undefined behaviour by using the wrong format specifier. %llX may only be used for unsigned long long. If you use the right specifier, %lld, then the apparent mystery goes away:
#include <stdio.h>

int main(int argc, char* argv[])
{
    long long max_ll = 0xFFFFFFFFFFFFFFFF;
    long long mask_ll = 0x00000F0000000000;
    printf("%lld %% %lld = %lld\n", max_ll, mask_ll, max_ll % mask_ll);
}
-1 % 16492674416640 = -1
In ISO C the definition of the % operator is such that (a/b)*b + a%b == a. Also, for negative numbers, / follows "truncation towards zero".
So -1 / 16492674416640 is 0, therefore -1 % 16492674416640 must be -1 to make the above formula work.
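The identity can be checked directly. This is a small sketch, assuming a C99 or later compiler (where division truncates toward zero); div_identity_holds is a hypothetical helper name, and it assumes b is nonzero and a/b is representable:

```c
/* Returns 1 when (a/b)*b + a%b == a, 0 otherwise.
   Assumes b != 0 and that a/b is representable. */
int div_identity_holds(long long a, long long b)
{
    long long q = a / b;   /* truncates toward zero */
    long long r = a % b;   /* zero, or the same sign as a */
    return q * b + r == a;
}
```

For example, -7 / 2 is -3 and -7 % 2 is -1, so (-3)*2 + (-1) gives back -7. The remainder takes the sign of the dividend, which is why -1 % 16492674416640 is -1 rather than a positive value.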
As discussed in comments, the following line:
long long max_ll = 0xFFFFFFFFFFFFFFFF;
causes implementation-defined behaviour (assuming that your system has long long as a 64-bit type). The constant 0xFFFFFFFFFFFFFFFF has type unsigned long long, and it is out of range for long long, whose maximum permitted value is 0x7FFFFFFFFFFFFFFF.
When an out-of-range assignment is made to a signed type, the behaviour is implementation-defined, which means the compiler documentation must say what happens.
Typically, this will be defined as generating the value which is in range of long long and has the same representation as the unsigned long long constant. In 2's complement, (long long)-1 has the same representation as the unsigned long long value 0xFFFFFFFFFFFFFFFF, which explains why you ended up with max_ll holding the value -1.
Upvotes: 2
Reputation: 29
Try putting unsigned before your long long. As a signed number, your 0xFF...FF is actually -1 on most platforms.
Also, in your code, your 32-bit numbers are still 64 bits (you have them declared as long long as well).
Upvotes: 1