Reputation: 1449
I had a university supervision assigment to write a function in assembler, which
would take three 32 bit unsigned numbers a, b, d and
return the result (a * b) / d
The c declaration of this function is:
unsigned int muldiv(const unsigned int a,
const unsigned int b,
const unsigned int d);
Note that we want to be sure that no unnecessary overflow or underflow happens. For example,
if a = 2^31, b = 2^31, d = 2^31,
the answer should be 2^31, despite the fact that a * b would overflow. (See more clarification on that below)
Now I wrote a simple function in c which works, then compiled it to machine code, then disassembled back to assembly code, finally removed some unnecessary instructions.
My final piece of assembly code is:
muldiv:
imulq %rsi, %rax
xorl %edx, %edx
divq %rcx
ret
Which works when compiled to executable code and checked on several test cases. However, I do not properly understand, what's happening in this code.
Hence, could anyone explain me why this code works (or perhaps it does not?), in particular:
why xorl %edx, %edx is necessary? When removed, I get a runtime error.
How does it make multiplication on long long numbers using only one instruction, if machine can operate only on 32 bit numbers?
Clarification of overflow and underflow: This function should return the result as if we're operating on unsigned 64bit numbers. The code in c is as follows:
// NOTE: when compiled to assembly code, I removed A LOT of instructions,
// but it still works
unsigned int muldiv(const unsigned int a,
const unsigned int b,
const unsigned int d) {
const unsigned long long la = a;
const unsigned long long lb = b;
const unsigned long long ld = d;
const unsigned long long ab = la * lb;
const unsigned long long ab_over_d = ab / ld;
return (unsigned int) ab_over_d;
}
And it worked, when called in such a way:
#include "muldiv.h"
int main(void) {
unsigned int a = (1 << 31);
unsigned int b = (1 << 31);
unsigned int d = (1 << 31);
unsigned int result = muldiv(a, b, d);
printf("%u\n", result); // prints (1 << 31), which is correct.
return 0;
}
Upvotes: 1
Views: 333
Reputation: 58762
why divq %rcx instruction uses only one register? I assume this is the division part, so how does it know which two arguments to use?
It uses implicit 128 bit dividend from register pair rdx:rax
. See the instruction set reference for the description of the operation.
how does it know that when I call muldiv from another place, the arguments a, b and d are stored in the registers %rsi / %rax / e.t.c, not somewhere else?
That's defined by the calling convention. See wikipedia for a summary.
why xorl %edx, %edx is necessary? When removed, I get a runtime error.
See point #1, above. rdx
has he top 64 bits of the dividend, so it should be cleared for unsigned division.
How does it make multiplication on long long numbers using only one instruction, if machine can operate only on 32 bit numbers?
You used 64 bit mode, so that's not true. Also, the mul and div instructions come in double width flavor, so you even get 128 bit versions in 64 bit mode, and 64 bit versions in 32 bit mode. Again, see the instruction set reference.
Upvotes: 4
Reputation: 2769
It's a bit of a partial answer I'm afraid, I'll need to look at it again when I have a moment (work calls!), but hopefully helps :-
why divq %rcx instruction uses only one register? I assume this is the division part, so how does it know which two arguments to use?
Divides unsigned the value in the AX, DX:AX, EDX:EAX, or RDX:RAX registers (dividend) by the source operand (divisor) and stores the result in the AX (AH:AL), DX:AX, EDX:EAX, or RDX:RAX registers. The source operand can be a general-purpose register or a memory location. The action of this instruction depends on the operand size (dividend/divisor). Division using 64-bit operand is available only in 64-bit mode.
Non-integral results are truncated (chopped) towards 0. The remainder is always less than the divisor in magnitude. Overflow is indicated with the #DE (divide error) exception rather than with the CF flag
how does it know that when I call muldiv from another place, the arguments a, b and d are stored in the registers %rsi / %rax / e.t.c, not somewhere else?
This is compiler and environment specific. You need to look up the ABI for your environment: What is Application Binary Interface (ABI)?
why xorl %edx, %edx is necessary? When removed, I get a runtime error.
It zeros the DX register.
How does it make multiplication on long long numbers using only one instruction, if machine can operate only on 32 bit numbers?
The RAX register is a 64bit register. So why do you think the machine is operating on 32-bit numbers? Maybe your test cases need expanding?
Upvotes: 4