Reputation: 1
I need some help with AT&T assembly again. I've loaded some data into memory as shown below (hex and decimal).
(gdb) x/8xb &buffer_in
0x8049096: 0x03 0x02 0x10 0x27 0xe8 0x03 0x64 0x00
(gdb) x/8db &buffer_in
0x8049096: 3 2 16 39 -24 3 100 0
Let's say the first byte is the number count, the second is the length of each number in bytes, and then come (first * second) bytes of numbers. In this example there are 3 numbers, 2 bytes each; the first number is 16 39, and so on. I would like to add all the numbers, so in this case that means adding 0x10 + 0xe8 (low bytes) into result[0], then 0x27 + 0x03 into result[1], and then again result[0] = result[0] + 0x64 and finally result[1] = result[1] + 0x00.
When I add 0x64 to result[0], which already contains 0xf8, the CF (carry flag) is set, and that's great of course, because I want to use this carry in the next addition, into result[1]. But the problem is that the next CMP instruction (marked in the code below) clears the carry flag, so the final result is 0x5C2A (when I combine the two bytes of result) when it should be 0x5C2B (the carry never reached the addition because of the cmp instruction).
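As a sanity check on the two values mentioned above, here is a small Python model (not assembly, just arithmetic). It assumes the numbers are little-endian, as the byte order in the dump suggests; the variable names are made up for illustration.

```python
# Bytes copied from the gdb dump above.
buffer_in = bytes([0x03, 0x02, 0x10, 0x27, 0xE8, 0x03, 0x64, 0x00])

count, size = buffer_in[0], buffer_in[1]
data = buffer_in[2:2 + count * size]
elements = [data[i * size:(i + 1) * size] for i in range(count)]

# Expected result: treat every element as a little-endian integer and add,
# keeping only as many bytes as one element has.
total = sum(int.from_bytes(e, "little") for e in elements) % (1 << (8 * size))
expected = total.to_bytes(size, "little")

# What happens when the carry between byte columns is lost:
# each column is summed on its own, modulo 256.
no_carry = bytes(sum(e[i] for e in elements) & 0xFF for i in range(size))

print(expected.hex(), no_carry.hex())  # → 5c2b 5c2a
```

The only difference between the two outputs is the carry out of the low byte column, which is exactly what the CF flag is supposed to hand over to the next adc.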
%eax - amount of numbers to sum
%ecx - length of each number in bytes
%esi - points to the first byte of the 'real' data (0x10 in this case) before the loops start
loop1:
movl $0, %ebx
loop2:
leal (%esi, %ebx, 1), %edi
movb (%edi), %dl # %dl contain now next byte to add
adc %dl, result(%ebx) # adding to result
inc %ebx
cmp %ebx, %ecx # this comparison clears the CF flag and that's the problem
JG loop2
leal (%esi, %ecx, 1), %esi
dec %al
cmp $0, %al
JG loop1
Upvotes: 0
Views: 801
Reputation: 76537
If you just want to save the carry flag there are a few tricks for that.
push the flags
pushf //save the flags
...... do stuff
popf //restore the flags
save CF in a register
//save CF in eax
sbb eax,eax //CF=1 -> CF=1, eax=-1; CF=0 -> CF=0, eax=0, clobbers other flags
//note that sbb reg, reg preserves! CF, how cool is that!
.... do stuff, do not alter eax
add eax,1 //restore CF
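To see why the sbb/add pair round-trips the carry, here is a minimal Python model of the two instructions on a 32-bit register. It only models the result and CF, nothing else, and the helper names `sbb_same` and `add_one` are hypothetical.

```python
MASK = 0xFFFFFFFF  # 32-bit register width

def sbb_same(cf):
    """Model `sbb eax,eax`: eax = eax - eax - CF. Returns (eax, cf_out)."""
    diff = 0 - cf
    cf_out = 1 if diff < 0 else 0   # a borrow happens exactly when CF was 1
    return diff & MASK, cf_out

def add_one(eax):
    """Model `add eax,1`. Returns (eax, cf_out)."""
    total = eax + 1
    return total & MASK, total >> 32  # carry out of bit 31

for cf_in in (0, 1):
    saved, cf_after_sbb = sbb_same(cf_in)
    _, cf_restored = add_one(saved)
    print(cf_in, hex(saved), cf_after_sbb, cf_restored)
```

In both cases the CF that comes out of `add eax,1` equals the CF that went into `sbb eax,eax`, provided eax is left alone in between.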
rewrite the loop so it counts up to zero
loop1:
mov ebx,ecx //ebx = count
lea esi,[esi+ecx] //esi = end of buffer
neg ebx //ebx = -count; note that neg sets CF when count != 0
clc //so clear CF before the first adc
loop2:
//no need for the lea (the mov can do complex addressing)
mov dl,[esi+ebx] //dl now contains the next byte to add
adc [ecx+ebx+result],dl //adding to result
inc ebx //ebx will be zero when done :-)
//no need for cmp
jnz loop2 //we only need ZF
Just in case you missed it, the trick works as follows.
First we add count to the base pointer.
Next we negate the count.
The loop thus starts at basepointer+count-count = basepointer.
At every iteration we increase -count.
This leads to the following effect in loop iteration n: address = base+count-count+n, ergo: adr = base + n.
When we are done, -count+n will be zero, and we don't need to do a cmp because the inc will adjust ZF for us as needed without clobbering CF.
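The index arithmetic described above can be sketched in Python. This models only the addresses visited, not the flags, and it assumes count is at least 1, since the loop body runs before the test; `visited_offsets` is a made-up name.

```python
def visited_offsets(base, count):
    """Offsets touched by a loop that indexes from -count up to 0."""
    end = base + count      # lea esi,[esi+ecx]
    index = -count          # neg ebx
    offsets = []
    while True:
        offsets.append(end + index)   # mov dl,[esi+ebx]
        index += 1                    # inc ebx (sets ZF, leaves CF alone)
        if index == 0:                # jnz loop2 falls through here
            break
    return offsets

print(visited_offsets(base=2, count=4))  # → [2, 3, 4, 5]
```

The loop walks forward from base even though the counter is negative, and the exit test is simply "did the counter reach zero".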
Note that I only use Intel syntax as a matter of principle.
Upvotes: 0
Reputation: 16586
This is usually resolved by adjusting the logic of the algorithm to avoid any CF-changing instruction between add and adc, which may actually look a bit impossible at first sight when you want to loop over a dynamic count of bytes.
But if you read the details of the INC and DEC instructions, there's one interesting thing that looks like a weird inconsistency: they don't affect CF! (It was actually designed like that just because of similar use cases, like this one.)
So your code may look like this (sorry for Intel+NASM syntax, I don't like AT&T, so convert on your own, at least you will know for sure you understand it well) (plus I didn't debug it, so it may have some bug, let me know in case there's a problem):
; zero the result data first
movzx edx,byte [buffer_in+1] ; element length
zero_result:
dec edx
mov [result+edx],byte 0
jnz zero_result
; now sum all elements
movzx ecx,byte [buffer_in+0] ; number of elements
lea esi,[buffer_in+2] ; source data ptr
elements_loop:
movzx edx,byte [buffer_in+1] ; element length
xor ebx,ebx ; offset of byte of element = 0 AND CF=0 (!)
element_byte_loop:
mov al,[esi] ; read source byte (no CF change)
inc esi ; ++ptr (CF preserved)
adc [result+ebx],al ; add it to result with CF
inc ebx ; next offset of byte inside element (CF preserved)
dec edx ; do all bytes of element (CF preserved)
jnz element_byte_loop
; single element added to result, now repeat it for all elements
dec ecx
jnz elements_loop
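For checking the assembly against a known-good answer, here is a rough Python model of the same two nested loops with the carry kept in an explicit variable. It is only a sketch of the logic, assuming the buffer layout from the question; `sum_elements` and `cf` are illustrative names.

```python
def sum_elements(buffer_in):
    count, size = buffer_in[0], buffer_in[1]
    result = [0] * size                # zero_result loop
    src = 2                            # lea esi,[buffer_in+2]
    for _ in range(count):             # elements_loop
        cf = 0                         # xor ebx,ebx clears CF
        for offset in range(size):     # element_byte_loop
            total = result[offset] + buffer_in[src] + cf   # adc
            result[offset] = total & 0xFF
            cf = total >> 8            # carry out of this byte
            src += 1                   # inc/dec do not touch cf
    return result

buffer_in = [0x03, 0x02, 0x10, 0x27, 0xE8, 0x03, 0x64, 0x00]
print([hex(b) for b in sum_elements(buffer_in)])  # → ['0x5c', '0x2b']
```

As in the assembly, a carry out of the highest byte of an element is dropped, so the sum wraps at the element width.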
Upvotes: 1