Reputation: 371
I'm reading through a disassembled Win32 C++ program and I see quite a few of these:
AND AL,0xFF
Is this completely pointless, or is there a reason the compiler would generate these?
Here is a longer example:
movsx eax, byte ptr [ebx]
shl eax, 18h
movsx edx, byte ptr [ebx+1]
shl edx, 10h
add eax, edx
movsx ecx, byte ptr [ebx+2]
shl ecx, 8
add eax, ecx
movsx edx, byte ptr [ebx+3]
add eax, edx
xor edx, edx
call sub_43B55C
mov ecx, eax
mov edx, eax
sar ecx, 10h
and al, 0FFh # <----
sar edx, 8
and cl, 0FFh # <----
mov [esi], cl
and dl, 0FFh # <----
mov [esi+1], dl
mov [esi+2], al
add ebx, 4
add esi, 3
inc ebp
cmp ebp, 6
jl short loc_43B5E4
The flags aren't being checked after these operations, so that can't be the purpose. After the AND, the values in AL, CL, and DL are being moved to [ESI + n].
Upvotes: 3
Views: 2457
Reputation: 19306
As @fuz suggested, this is simply the fault of an optimizer not recognizing foo & 0xff as a no-op in the context in which it was most probably used in the original function.
I compiled the following code snippet with Borland C++ Builder 6 after setting the project's compilation settings to "Release":
unsigned char foobar(int foo) { return (foo >> 16) & 0xff; }
This resembles the operations carried out in the disassembly you provided quite closely. We have a 32-bit value that we shift right by a given number of bits and then truncate to a byte, essentially returning bits 16-23 of the original value as one byte. The input parameter is of type int in order to generate a sar instruction instead of a shr: most probably an int was used in the original code as well.
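To check that the mask really is a no-op in this context, here is a minimal sketch. The names with_mask, without_mask and mask_is_redundant are mine and purely illustrative; nothing here comes from the original program. It only relies on the fact that converting an int to unsigned char keeps the low 8 bits.

```cpp
#include <cassert>
#include <limits>

// Same shape as foobar above: shift, mask, then narrow to a byte.
unsigned char with_mask(int foo) { return (foo >> 16) & 0xff; }

// Hypothetical variant without the mask. The conversion to unsigned char
// already discards everything above bit 7, so the mask adds nothing.
unsigned char without_mask(int foo) { return static_cast<unsigned char>(foo >> 16); }

// Compares both variants over a handful of values, including negative ones
// (where sar and shr would differ in the upper bits, but not in the low byte).
bool mask_is_redundant() {
    const int samples[] = {0, 1, -1, 0x00ABCDEF, 0x12345678, -123456789,
                           std::numeric_limits<int>::max(),
                           std::numeric_limits<int>::min()};
    for (int v : samples) {
        if (with_mask(v) != without_mask(v)) return false;
    }
    return true;
}
```

A compiler that tracks the width of the result can therefore drop the and entirely, which is what one would expect from a release build.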
After compiling and disassembling the resulting .obj with objconv (as I couldn't figure out how to enable assembly listings from within C++ Builder's IDE), I got this:
@foobar$qi PROC NEAR
; COMDEF @foobar$qi
push ebp ; 0000 _ 55
mov ebp, esp ; 0001 _ 8B. EC
mov eax, dword ptr [ebp+8H] ; 0003 _ 8B. 45, 08
sar eax, 16 ; 0006 _ C1. F8, 10
and al, 0FFFFFFFFH ; 0009 _ 24, FF
pop ebp ; 000B _ 5D
ret ; 000C _ C3
@foobar$qi ENDP
As you can see, the redundant and is still there. The 32-bit immediate in the disassembly can be disregarded, since the instruction's encoding (24 FF) clearly shows that the immediate in the actual code stream is 8-bit; there are no other valid options with an 8-bit register anyway.
Microsoft Visual C++ 6 seems to be guilty of the same thing, but it operates on the whole 32-bit register (thus generating 3 more bytes due to the 32-bit immediate) and clears the upper bits, which is needless, seeing how the return value of the function was explicitly declared to be 8-bit:
?foobar@@YAEH@Z PROC NEAR ; foobar
; 1 : unsigned char foobar(int foo) { return (foo >> 16) & 0xff; }
00000 55 push ebp
00001 8b ec mov ebp, esp
00003 8b 45 08 mov eax, DWORD PTR _foo$[ebp]
00006 c1 f8 10 sar eax, 16 ; 00000010H
00009 25 ff 00 00 00 and eax, 255 ; 000000ffH
0000e 5d pop ebp
0000f c3 ret 0
?foobar@@YAEH@Z ENDP ; foobar
Meanwhile, the oldest version of gcc available on godbolt correctly compiles this into what is essentially just a shift; the remaining differences between the listings come down to calling conventions.
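For what it's worth, the loop in the question is consistent with source along the following lines. This is only my guess at a reconstruction: repack, widen, Transform and identity are hypothetical names, and identity merely stands in for sub_43B55C, whose real behaviour I don't know (all the listing shows is that it is called with the packed value and a second argument of 0).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical signature for sub_43B55C: packed value in, 32-bit value out.
using Transform = std::int32_t (*)(std::int32_t value, std::int32_t zero);

// Sign-extend one byte (movsx) and shift it into place. The shift is done on
// an unsigned value so it stays well defined for negative bytes.
static std::uint32_t widen(signed char c, int shift) {
    return static_cast<std::uint32_t>(static_cast<std::int32_t>(c)) << shift;
}

// Reads 6 groups of 4 input bytes and writes 3 output bytes per group.
void repack(const signed char* in, unsigned char* out, Transform f) {
    for (int i = 0; i < 6; ++i) {                       // cmp ebp, 6 / jl
        std::uint32_t packed = widen(in[0], 24) + widen(in[1], 16)
                             + widen(in[2], 8) + widen(in[3], 0);
        std::int32_t v = f(static_cast<std::int32_t>(packed), 0);
        out[0] = (v >> 16) & 0xff;                      // sar ecx, 10h / and cl, 0FFh
        out[1] = (v >> 8) & 0xff;                       // sar edx, 8   / and dl, 0FFh
        out[2] = v & 0xff;                              // and al, 0FFh
        in += 4;                                        // add ebx, 4
        out += 3;                                       // add esi, 3
    }
}

// Placeholder transform used only to exercise the sketch.
std::int32_t identity(std::int32_t value, std::int32_t) { return value; }
```

Each of the three masks feeds a store to an unsigned char, so all of them are redundant in the same way as in foobar.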
Upvotes: 4