Negative hexadecimal representation

Question

I had a question which was like this:

 mov r15, 0x407116EF3867BBCA  
 sub r15, 0x95F67F70A1BCEE9D
What value is in r15 after this code executes?
(Be careful about integer overflow/underflows)

Solution: Python 3 math string for debugging: (i64 - i64_2) % (2**64). When I evaluated this formula, I got 0xaa7a977e96aacd2d.

My doubt is that the answer should have been negative, but I think it has been converted to 2's complement. If that's the case, why does it need to be converted to 2's complement? Am I thinking in the right direction? If not, please correct me. Also, what's the logic behind this formula (taking the result modulo 2^64)?

Peter Cordes · Accepted Answer

Yes, in assembly, a 64-bit register value can always be represented by 16 hex digits. There isn't a separate plus/minus bit; it's best to think of binary operations like add / sub as unsigned¹, and only worry about the 2's complement interpretation of the final result².

That means it's slightly inconvenient to use arbitrary-precision calculators like Python 3 integers or calc (aka apcalc), which show you negative results instead of wrapping like a C uint64_t when a subtraction underflows past 0.

There are 2 ways to think about fixing the result to be what you want (in the [0 .. 2**64-1] range):

1. Modulo-reduce it back into that range. That's what % does in Python. (Unlike some other languages like C, or the idiv instruction in x86-64 asm, Python's % gives you the modulus, which is never negative for a positive divisor, rather than the remainder. e.g. -1 % 2 is 1 in Python, but in C for signed int it's -1.)

   You can even do that reduction manually by adding 2**64 to a negative number to get the 2's complement binary representation. You know that any add or sub result is going to be less than 2**64 outside that range, so only one addition (or one subtraction, for an add with carry-out) will be necessary.

2. Bitwise-truncate it to 64 bits, taking the low 64 bits of Python's extended-precision 2's complement representation. Python defines bitwise operations on integers as though they were carried out in 2's complement with an infinite number of sign bits, so this doesn't depend on how the interpreter stores integers internally.

All three expressions (the % reduction, manually adding 2**64, and the & mask) give the same correct result. In an interactive Python 3.9 session:

>>> (0x407116EF3867BBCA - 0x95F67F70A1BCEE9D)
-6162446570153652947
>>> hex (0x407116EF3867BBCA - 0x95F67F70A1BCEE9D)
'-0x55856881695532d3'

>>> hex( (0x407116EF3867BBCA - 0x95F67F70A1BCEE9D) % (2**64) )
'0xaa7a977e96aacd2d'
>>> hex( (0x407116EF3867BBCA - 0x95F67F70A1BCEE9D) + 2**64 )
'0xaa7a977e96aacd2d'

>>> hex( (0x407116EF3867BBCA - 0x95F67F70A1BCEE9D) & (2**64-1) )
'0xaa7a977e96aacd2d'
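
To make the modulus-versus-remainder distinction concrete, here is a minimal sketch. The helper c_style_rem is a hypothetical name, not a standard-library function; it only imitates the truncating remainder that C's % and x86 idiv produce, to show why that kind of remainder would not wrap a negative difference into the [0 .. 2**64-1] range.

```python
def c_style_rem(a: int, b: int) -> int:
    """Truncating remainder: the result takes the sign of the dividend,
    like C's % on signed ints or the remainder from x86 idiv."""
    q = abs(a) // abs(b)          # magnitude of the quotient, rounded toward zero
    if (a < 0) != (b < 0):
        q = -q                    # quotient is negative when the signs differ
    return a - q * b

diff = 0x407116EF3867BBCA - 0x95F67F70A1BCEE9D

print(-1 % 2, c_style_rem(-1, 2))        # 1 -1
print(hex(diff % 2**64))                 # 0xaa7a977e96aacd2d
print(hex(c_style_rem(diff, 2**64)))     # -0x55856881695532d3 (unchanged, still negative)
```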

Also, if this is supposed to be x86-64, that sub won't assemble: only mov can use a 64-bit immediate, and 0x95F67F70A1BCEE9D doesn't fit in (isn't representable as) a 32-bit sign-extended immediate.
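
As a rough way to check that claim, the sketch below tests whether a 64-bit value survives truncation to 32 bits followed by sign-extension back to 64. The function name fits_sign_extended_imm32 is made up for illustration; it models only the representability rule, not an assembler.

```python
MASK64 = 2**64 - 1

def fits_sign_extended_imm32(value: int) -> bool:
    """True if the 64-bit pattern equals its own low 32 bits sign-extended to 64."""
    value &= MASK64
    low32 = value & 0xFFFFFFFF
    # Sign-extend: copy bit 31 of low32 into bits 32..63.
    extended = low32 | (0xFFFFFFFF00000000 if low32 & 0x80000000 else 0)
    return extended == value

print(fits_sign_extended_imm32(0x95F67F70A1BCEE9D))  # False
print(fits_sign_extended_imm32(0x7FFFFFFF))          # True
print(fits_sign_extended_imm32(-1))                  # True (0xFFFFFFFFFFFFFFFF)
```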

But if it did, CF would be set because there's a borrow out of the high bit: 0x4... - 0x9... = 0xa... wrapped past zero, since the left-hand operand of the subtraction was below the right-hand operand when both are treated as unsigned.

And OF would be set, because in a signed interpretation (where we look at the MSB as having a place-value of - 2^63 instead of + 2^63), a positive number minus a larger-magnitude negative number produced such a large positive result that it overflowed to negative. (i.e. positive - negative = negative implies signed overflow, just like positive + positive = negative would)

And SF=1 according to the MSB of the result.
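
The flag reasoning above can be sketched in Python. The function sub64_flags is a hypothetical helper that models only CF, OF and SF for a 64-bit sub, using the usual rules (borrow for CF; operands with different signs and a result whose sign differs from the left operand for OF); it is not a full emulation of the instruction.

```python
MASK64 = 2**64 - 1
SIGN64 = 1 << 63

def sub64_flags(a: int, b: int):
    """Model a 64-bit `sub a, b`: return (result, flags) for CF, OF and SF."""
    a &= MASK64
    b &= MASK64
    result = (a - b) & MASK64
    flags = {
        "CF": a < b,                                    # unsigned borrow
        "OF": bool((a ^ b) & (a ^ result) & SIGN64),    # signed overflow
        "SF": bool(result & SIGN64),                    # MSB of the result
    }
    return result, flags

result, flags = sub64_flags(0x407116EF3867BBCA, 0x95F67F70A1BCEE9D)
print(hex(result), flags)
# 0xaa7a977e96aacd2d {'CF': True, 'OF': True, 'SF': True}
```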

Footnote 1:
The high half of widening multiply, and division in general, care about the place-value of the MSB; add/sub don't: 2's complement addition is the same binary operation as unsigned, including wrapping around. That's why x86-64 only has one sub instruction, but has div and idiv, and for one-operand widening multiply has mul separate from the usual imul. But the non-widening forms of imul like imul eax, r9d are the same for signed or unsigned.
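
A small check of this footnote, as a sketch: the helper signed64 is an illustrative name for reinterpreting a 64-bit pattern as signed. Subtraction and the low half of the product come out identical under both interpretations, while the high half of the 128-bit product does not.

```python
MASK64 = 2**64 - 1
MASK128 = 2**128 - 1

def signed64(bits: int) -> int:
    """Reinterpret a 64-bit pattern as a 2's complement signed integer."""
    bits &= MASK64
    return bits - 2**64 if bits >> 63 else bits

a = 0x407116EF3867BBCA
b = 0x95F67F70A1BCEE9D

# sub: same 64-bit pattern whether the inputs are read as unsigned or signed.
assert (a - b) & MASK64 == (signed64(a) - signed64(b)) & MASK64

# Widening multiply: 128-bit products of the unsigned vs signed interpretations.
u_full = (a * b) & MASK128                      # what mul would produce
s_full = (signed64(a) * signed64(b)) & MASK128  # what one-operand imul would produce

assert u_full & MASK64 == s_full & MASK64       # low halves agree
assert u_full >> 64 != s_full >> 64             # high halves differ
print(hex(u_full >> 64), hex(s_full >> 64))
```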

Footnote 2:
If the high bit is set, then it's negative if you interpret the bit-pattern as a 2's complement signed integer, instead of unsigned. See Wikipedia's article about 2's complement. If the first hex digit is 8 to F, the high bit is set, so in your case 0x9... and 0xa... represent negative numbers, while 0x4... represents a positive number.
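
For illustration, a minimal sketch of that signed interpretation in Python. The name as_signed64 is hypothetical; it subtracts 2**64 when the high bit is set, which is the inverse of the "+ 2**64" fix-up used in the answer.

```python
def as_signed64(bits: int) -> int:
    """Read a 64-bit pattern as a 2's complement signed integer."""
    bits &= 2**64 - 1
    return bits - 2**64 if bits & (1 << 63) else bits

for value in (0x407116EF3867BBCA, 0x95F67F70A1BCEE9D, 0xAA7A977E96AACD2D):
    print(hex(value), as_signed64(value))

# The last line shows -6162446570153652947, the same negative number that
# Python prints for the unwrapped subtraction.
```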
