Reputation: 1
I have been trying to solve this using assembly language. The thing is I cannot store the 10! In al
and my code works for finding factorial of 5. How do I store my result of 10! In a register? When I find factorial of 5 I can see the result clearly in al
because 120 can be stored in al
.
Any help would be appreciated.
Here's my code for 5!
org 100h
.DATA
ANS DB ?
.CODE
MAIN PROC
MOV AX,@DATA
MOV DS,AX
MOV AL,5
MOV CL,4
MOV BL,AL
SUB BL,1
L:
MUL BL
SUB BL,1
LOOP L
MOV ANS,AL
END MAIN
ret
Upvotes: 0
Views: 6678
Reputation: 1
It's because AL
, BL
, and CL
are of 8 bits, but what you require is a 16-bit register. This can be done by replacing all L
with their X
counterparts. Although doing so won't help in storing 10! (which requires even more than 16-bits), but we can find 8!, for example (i.e. anything which can be stored in 16-bits).
org 100h
.DATA
ANS DB ?
.CODE
MAIN PROC
MOV AX, @DATA
MOV AX, 8
MOV CX, 7
MOV BX, AX
SUB BX, 1
LOOP_LB:
MUL BX
SUB BX, 1
LOOP LOOP_LB
END MAIN
ret
The screenshot of the output:
Upvotes: -1
Reputation: 37224
10!
is 10*9*8*7*6*5*4*3*2*1
.
For speed you can minimize the size of intermediate values by doing them in "big * little" pairs, like (1*10) * (2*9) * (3*8) * (4*7) * (5*6)
.
You can describe these pairs as the expression temp = x * (11 - x)
(with values of x
from 1 to 5). If you know the result for x
(e.g. temp = 10
if x == 1
) then the result of the next pair can be derived from the result of the previous pair. E.g.:
temp1 = x * (11 - x)
temp2 = (x+1) * (11 - (x+1))
= (x+1) * (11 - x - 1)
= x * (11 - x - 1) + (11 - x - 1)
= x * (11 - x) - x + (11 - x - 1)
= temp1 - x + (11 - x - 1)
= temp1 - x - x + 11 -1
= temp1 - (x << 1) + 10
In other words; if you know that the result of "pair 1" (or 1*10
) will be 10, then you can calculate the result of the next pair (or 2*9
) by doing (10) - (1<<1) + 10 = 18;
, then calculate the result of the next pair by doing (18) - (2<<1) + 10 = 24
, then...
Note that there is no multiplication involved for calculating the pairs this way. After calculating the intermediate results from pairs (with add/sub alone), you end up with "10! = 10*18*24*28*30
".
For more fun; you can prove that all of these intermediate results will always be even. That means you can cheat and say "10! = 10*18*24*28*30 = 5*9*12*14*15 << (1+1+1+1+1) = 5*9*12*14*15 << 5
"; which is important when you're looking at doing more expensive "too big" multiplications later.
Because these intermediate values fit in 8 bits, we can do "pairs of pairs" just with two 16-bit multiplications. "10! = (5*9) * (12*14) * 15 << 5 = 45 * 168 * 15 << 5
".
By multiplying the smallest intermediate values the result will still fit in 16 bits. "10! = 45 * 168 * 15 << 5 = (45*15) * 168 << 5 = 675 * 168 << 5
".
Now you only need to do one multiplication where the result is 32 bits but both operands are 16 bit. Real mode handles this perfectly fine with a single mul
instruction so that is no problem; it's just that the result will be stored in dx:ax
. Specifically, for 675 * 168 the result in dx:ax
will be 113400 (or 0x1BAF8, or 0x0001 in dx
and 0xBAF8 in ax
).
This gives us "10! = 113400 << 5
". For this it'll be a little awkward on an old "16-bit only" CPU (where you can't just use 32-bit instructions in real mode, including 32-bit mul
). The best way for 8086 is probably using another register for a "right shift then or
" to get the bits from ax
into dx
; like:
mov cl,5
mov bx,ax
shl dx,cl
shl ax,cl
mov cl,16-5
shr bx,cl
or dx,bx
For 80186 and later you can do it slightly better using "shift by immediate" to avoid using cl
, like shl dx,5
, but that isn't supported for 8086.
The result will be 3628800, which is the right answer.
Note that this same approach can be tweaked to handle other factorials - for odd factorials you can ignore the *1
that doesn't actually do anything to ensure the intermediate results from pairs are still even (e.g. "11! = 1 * (2*11) * (3*10) * (4*9) * (5*8) * (6*7) = (2*11) * (3*10) * (4*9) * (5*8) * (6*7)
"). For all factorials (n!
); the intermediate calculation of pairs can be described as temp2 = temp1 - (x << 1) + (n & (~1))
(I think - didn't actually check); and the final shift will always be n/2
rounded down (e.g. for 11!
the shift count will be "(int)11/2 = (int)5.5 = 5
).
With 16-bit multiplication alone it should be good up until 15!. For 16! and larger, the same techniques work to minimize the number of 32-bit multiplications needed.
Upvotes: 2
Reputation: 14399
I tried to explain the algorithm to multiply a dword by a word in a 16-bit environment here: https://stackoverflow.com/a/28683170/3512216
An implementation that computes 10! and works also in EMU8086:
.MODEL SMALL
.STACK 1000h
.DATA
decstr DB 16 DUP ('$') ; String is $-terminated
.CODE
main PROC
mov ax, @DATA ; Initialize DS
mov ds, ax
mov bx, 10 ; Factorial 10! = 3.628.800
xor dx, dx ; DX:AX=1 (first multiplicand)
mov ax, 1 ; Begin with 1
; for (dx:ax = 1, cx = 2; cx <= 10; cx++)
mov cx, 2 ; Incrementing multiplicator
L1:
call mul_dword_word ; DX:AX * CX -> DX:AX
inc cx
cmp cx, bx
jbe L1 ; While cx <= 10
; Print result
mov di, OFFSET decstr
call dword_to_dec
mov dx, OFFSET decstr
mov ah, 9
int 21h
; Exit
mov ax, 4C00h
int 21h
main ENDP
mul_dword_word PROC ; DX:AX multiplicand, CX multiplier
push dx
mul cx ; AX * CX -> DX:AX
mov si, dx ; Store high result
mov di, ax ; Low result won't be changed anymore
pop ax ; High word
mul cx ; AX * CX -> DX:AX
add ax, si ; Add high result from last mul to low result here
adc dx, 0
mov si, dx ; SI:DX:AX return value
mov dx, ax
mov ax, di
ret ; RET: SI:DX:AX result
mul_dword_word ENDP
dword_to_dec PROC ; ARG DX:AX DWORD, DI: offset of string
mov cs:target, di
mov si, ax
mov di, dx
; First Loop: get digits and push them
mov cs:counter, 0
mov bx, 10
LL1:
inc cs:counter
xor dx, dx
mov ax, di ; High WORD
mov cx, ax
div bx ; DX:AX / BX -> AX Remainder DX
mov di, ax ; Store new high word
mul bx ; AX * BX -> DX:AX
sub cx, ax ; sub highest CX-divisible value
mov dx, cx
mov ax, si ; Low WORD
div bx ; DX:AX / BX -> AX Remainder DX
or dl, 30h ; Convert remainder to ASCII
push dx ; Store remainder
mov si, ax ; Store new low WORD
or ax, di ; Anything more to process?
jnz LL1 ; yes: jump to LL1 above
; Second Loop: get back digits in reversed order
mov di, cs:target
mov cx, cs:counter
LL2:
pop ax
mov [di], al
inc di
loop LL2
mov BYTE PTR [di], '$' ; Terminator for INT 21h/09h
ret
counter dw 0
target dw 0
dword_to_dec ENDP
Upvotes: 2