user35358

Reputation: 371

What's the purpose of "AND AL,0xFF"?

I'm reading through a disassembled Win32 C++ program and I see quite a few of these:

AND AL,0xFF

Is this completely pointless? If not, why would the compiler generate these?

Here is a longer example:

movsx   eax, byte ptr [ebx]
shl     eax, 18h
movsx   edx, byte ptr [ebx+1]
shl     edx, 10h
add     eax, edx
movsx   ecx, byte ptr [ebx+2]
shl     ecx, 8
add     eax, ecx
movsx   edx, byte ptr [ebx+3]
add     eax, edx
xor     edx, edx
call    sub_43B55C
mov     ecx, eax
mov     edx, eax
sar     ecx, 10h
and     al, 0FFh      # <----
sar     edx, 8
and     cl, 0FFh      # <----
mov     [esi], cl
and     dl, 0FFh      # <----
mov     [esi+1], dl
mov     [esi+2], al
add     ebx, 4
add     esi, 3
inc     ebp
cmp     ebp, 6
jl      short loc_43B5E4

The flags aren't checked after these operations, so setting them can't be the purpose. After each AND, the values in AL, CL, and DL are simply stored to [ESI + n].

Upvotes: 3

Views: 2457

Answers (1)

Daniel Kamil Kozar

Reputation: 19306

As @fuz suggested, this is simply an optimizer failing to recognize that foo & 0xff is a no-op in the context in which it was most probably used in the original function.
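To illustrate why the mask is a no-op in such a context, here is a minimal sketch. The helper names are hypothetical and not taken from the original program. It assumes the result ends up in an 8-bit unsigned value, in which case the narrowing conversion alone already discards everything above bit 7.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper names, chosen for this illustration only.

// Keeps the low byte with an explicit mask, as the original source probably did.
constexpr std::uint8_t low_byte_masked(std::int32_t v) {
    return static_cast<std::uint8_t>(v & 0xff);
}

// Keeps the low byte by narrowing alone; conversion to an unsigned 8-bit
// type is modulo 2^8, so no mask is needed.
constexpr std::uint8_t low_byte_plain(std::int32_t v) {
    return static_cast<std::uint8_t>(v);
}

// Compares both helpers over a range of positive and negative inputs.
inline bool mask_is_redundant() {
    for (std::int32_t v = -70000; v <= 70000; ++v) {
        if (low_byte_masked(v) != low_byte_plain(v)) {
            return false;
        }
    }
    return true;
}

// Small usage check: both helpers agree on every tested input.
inline void check_mask_is_redundant() {
    assert(mask_is_redundant());
}
```

An optimizer that tracks which bits of a value are actually used can drop the mask; one that doesn't will emit it verbatim.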

I compiled the following code snippet with Borland C++ Builder 6 after setting the project's compilation settings to "Release":

unsigned char foobar(int foo) { return (foo >> 16) & 0xff; }

This closely resembles the operations carried out in the disassembly you provided. We have a 32-bit value which we want to shift by a given number of bits and then turn into a byte value, essentially returning bits 16-23 of the original value as one byte. The input parameter is of type int in order to generate a sar instruction instead of a shr; most probably an int was used in the original code as well.
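For comparison with the loop in the question, the three stores to [esi], [esi+1] and [esi+2] could have come from source along the following lines. This is only a guess at the original code: the function name and parameters are invented, and the instruction comments simply pair each line with the matching part of the listing.

```cpp
#include <cassert>

// Hypothetical reconstruction; the real source is unknown.
// Writes bits 16-23, 8-15 and 0-7 of value to out[0], out[1] and out[2].
inline void store_low_three_bytes(int value, unsigned char* out) {
    out[0] = static_cast<unsigned char>((value >> 16) & 0xff);  // sar ecx, 10h ; and cl, 0FFh ; mov [esi], cl
    out[1] = static_cast<unsigned char>((value >> 8) & 0xff);   // sar edx, 8   ; and dl, 0FFh ; mov [esi+1], dl
    out[2] = static_cast<unsigned char>(value & 0xff);          //                and al, 0FFh ; mov [esi+2], al
}

// Small usage check with a value whose bytes are easy to tell apart.
inline void check_store_low_three_bytes() {
    unsigned char out[3] = {0, 0, 0};
    store_low_three_bytes(0x12345678, out);
    assert(out[0] == 0x34);
    assert(out[1] == 0x56);
    assert(out[2] == 0x78);
}
```

Each store only writes one byte, so every & 0xff in this sketch is redundant, which matches the three marked and instructions.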

After compiling and disassembling the resulting .obj with objconv (as I couldn't figure out how to enable assembly listings from within the C++ Builder IDE), I got this:

@foobar$qi PROC NEAR
;  COMDEF @foobar$qi
        push    ebp                                     ; 0000 _ 55
        mov     ebp, esp                                ; 0001 _ 8B. EC
        mov     eax, dword ptr [ebp+8H]                 ; 0003 _ 8B. 45, 08
        sar     eax, 16                                 ; 0006 _ C1. F8, 10
        and     al, 0FFFFFFFFH                          ; 0009 _ 24, FF
        pop     ebp                                     ; 000B _ 5D
        ret                                             ; 000C _ C3
@foobar$qi ENDP

As you can see, the redundant and is still there. The 32-bit immediate shown in the disassembly can be disregarded: the instruction's encoding (24 FF) clearly shows that the immediate in the actual code stream is 8-bit, and there are no other valid options with an 8-bit register anyway.
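One plausible explanation for the 0FFFFFFFFH in the listing, offered here as an assumption about how the disassembler prints operands rather than as documented objconv behaviour, is that the 8-bit immediate is treated as signed and sign-extended to 32 bits for display. The helper below is hypothetical and only demonstrates that arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: sign-extends an 8-bit immediate to 32 bits,
// the way a disassembler might before printing it (an assumption).
constexpr std::uint32_t sign_extend_imm8(std::uint8_t imm) {
    return static_cast<std::uint32_t>(
        static_cast<std::int32_t>(static_cast<std::int8_t>(imm)));
}

// 0xFF has its top bit set, so it extends to all ones.
static_assert(sign_extend_imm8(0xFF) == 0xFFFFFFFFu, "0xFF extends to 0xFFFFFFFF");
// 0x7F has its top bit clear, so it extends with zeros.
static_assert(sign_extend_imm8(0x7F) == 0x0000007Fu, "0x7F extends to 0x0000007F");

// Small runtime usage check mirroring the compile-time ones.
inline void check_sign_extend_imm8() {
    assert(sign_extend_imm8(0xFF) == 0xFFFFFFFFu);
    assert(sign_extend_imm8(0x00) == 0u);
}
```

Either way, the byte in the code stream is FF, and the operation on al is the same.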

Microsoft Visual Studio C++ 6 seems to be guilty of the same thing, but operates on the whole 32-bit register (thus generating 3 more bytes due to the 32-bit immediate), clearing the upper bits. This is needless, seeing how the return value of the function was explicitly declared to be 8-bit:

?foobar@@YAEH@Z PROC NEAR                               ; foobar
; 1    : unsigned char foobar(int foo) { return (foo >> 16) & 0xff; }
  00000 55               push    ebp
  00001 8b ec            mov     ebp, esp
  00003 8b 45 08         mov     eax, DWORD PTR _foo$[ebp]
  00006 c1 f8 10         sar     eax, 16                        ; 00000010H
  00009 25 ff 00 00 00   and     eax, 255               ; 000000ffH
  0000e 5d               pop     ebp
  0000f c3               ret     0
?foobar@@YAEH@Z ENDP                                    ; foobar

Meanwhile, the oldest version of gcc available on godbolt correctly compiles this into what's essentially just a shift, apart from the natural differences between the listings due to calling conventions.

Upvotes: 4
