Reputation: 5044
I compiled the following function in VC++ 2010 to see what optimized (/O2) 32-bit code was generated (note the unsigned long long type of the switch operand):
#include <stdio.h>
#include <string.h>

int proc1(unsigned long long a, char*s) {
    switch (a) {
    case -1L:
        printf("-1");
        break;
    case -11:
        printf("-11");
        break;
    case -3:
        printf("-3");
        break;
    case 21:
        printf("21");
        break;
    case 25:
        printf("25");
        break;
    case 29:
        printf("29");
        break;
    case 31:
        printf("31");
        break;
    case 40:
        printf("40");
        break;
    }
    return strlen(s);
}
Here is part of the generated code:
_TEXT SEGMENT
_a$ = 8 ; size = 8
_s$ = 16 ; size = 4
?proc1@@YAH_KPAD@Z PROC ; proc1, COMDAT
; 5 : int proc1(unsigned long long a, char*s) {
push ebp
mov ebp, esp
; 6 :
; 7 : switch (a ) {
mov ecx, DWORD PTR _a$[ebp+4]
mov eax, DWORD PTR _a$[ebp]
test ecx, ecx
ja SHORT $LN13@proc1
jb SHORT $LN15@proc1
cmp eax, 40 ; 00000028H
ja SHORT $LN13@proc1
$LN15@proc1:
cmp eax, 40 ; 00000028H
jne SHORT $LN16@proc1
test ecx, ecx
je SHORT $LN1@proc1
$LN16@proc1:
test ecx, ecx
ja SHORT $LN14@proc1
jb SHORT $LN17@proc1
cmp eax, 29 ; 0000001dH
ja SHORT $LN14@proc1
But why does it use ja + jb after the test ecx, ecx? That is:
test ecx, ecx
ja SHORT $LN13@proc1
jb SHORT $LN15@proc1
From what I understand, test ecx,ecx checks whether ecx is zero, so I would expect a je/jz/jnz after the instruction. ja is taken when CF=0 and ZF=0, while jb is taken when CF=1, so they do not seem appropriate after a test instruction. It makes even less sense to me to have a jb following the ja, since nothing before them sets CF in a meaningful way.
Upvotes: 0
Views: 89
Reputation: 26171
The JA and JB pairs are used because you are branching on an unsigned type. Because the 8-byte value spans two registers (ECX holds the high DWORD, EAX the low DWORD), the code first tests the high DWORD, which is non-zero for the negative cases, and then checks the low DWORD for the positive values.
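To make the flag behaviour concrete, here is a minimal sketch in C that models only the two flags JA and JB read. It assumes the documented x86 semantics: TEST sets ZF from the AND of its operands and always clears CF, JA is taken when CF=0 and ZF=0, and JB is taken when CF=1. The names Flags, x86_test, x86_cmp, ja_taken and jb_taken are made up for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the two x86 flags that JA and JB read. */
typedef struct {
    bool cf; /* carry flag */
    bool zf; /* zero flag  */
} Flags;

/* TEST r, r: computes r & r, sets ZF from the result, always clears CF. */
Flags x86_test(uint32_t r) {
    Flags f = { false, (r & r) == 0 };
    return f;
}

/* CMP a, b: computes a - b; CF is the unsigned borrow, ZF is set on equality. */
Flags x86_cmp(uint32_t a, uint32_t b) {
    Flags f = { a < b, a == b };
    return f;
}

/* JA is taken when CF == 0 and ZF == 0 (unsigned "above"). */
bool ja_taken(Flags f) { return !f.cf && !f.zf; }

/* JB is taken when CF == 1 (unsigned "below"). */
bool jb_taken(Flags f) { return f.cf; }
```

Under this model, JA after TEST ECX,ECX behaves like JNZ (taken whenever the high DWORD is non-zero), and the JB that follows can never be taken. That suggests the JA/JB pair is the generic pattern for comparing the high halves of two unsigned 64-bit values, with the compare against a zero constant reduced to a TEST.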
Inspecting partial assembly is a little fruitless here; one needs to examine the whole function to determine the scope of the optimization. In the full assembly there are two distinct sections, one for the positive case values and one for the negative cases.
Because this switch was transformed into a set of chained branches, each range check begins with a test of the high DWORD (to skip to the negative part), followed by a check of whether the low DWORD is greater than the maximum positive value (if so, it also branches to the negative portion), and finally the actual compare against the case value.
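The chained checks can be written out in C on the two 32-bit halves. This is only a sketch of the pattern, not the compiler's actual algorithm; above_u64 and equals_u64 are hypothetical helper names, and k stands for a small case constant whose high DWORD is 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: is the 64-bit value hi:lo above a small constant k
   (k fits in the low DWORD, so its high DWORD is 0)? */
bool above_u64(uint32_t hi, uint32_t lo, uint32_t k) {
    if (hi > 0) {
        return true;  /* TEST ECX,ECX / JA: high DWORD is non-zero */
    }
    /* "hi < 0" would be the JB path; it cannot happen for an unsigned value */
    return lo > k;    /* CMP EAX,k / JA */
}

/* Hypothetical helper: the equality test done for each positive case label. */
bool equals_u64(uint32_t hi, uint32_t lo, uint32_t k) {
    return lo == k && hi == 0;  /* CMP EAX,k / JNZ, then TEST ECX,ECX / JE */
}
```

For example, with a equal to the converted value of -11, hi is 0xFFFFFFFF, so above_u64(hi, lo, 40) is true on the first check and control goes to the negative section without looking at the low DWORD.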
64TypeTe.p>/$ >PUSH EBP
00C21001 |. >MOV EBP,ESP
00C21003 |. >MOV ECX,DWORD PTR SS:[EBP+C]
00C21006 |. >MOV EAX,DWORD PTR SS:[EBP+8]
00C21009 |. >TEST ECX,ECX
00C2100B |. >JA 64TypeTe.00C210CD
00C21011 |. >JB SHORT 64TypeTe.00C2101C
00C21013 |. >CMP EAX,28
00C21016 |. >JA 64TypeTe.00C210CD
00C2101C |> >CMP EAX,28
00C2101F |. >JNZ SHORT 64TypeTe.00C21029
00C21021 |. >TEST ECX,ECX
00C21023 |. >JE 64TypeTe.00C210B8
00C21029 |> >TEST ECX,ECX
00C2102B |. >JA SHORT 64TypeTe.00C21096
00C2102D |. >JB SHORT 64TypeTe.00C21034
00C2102F |. >CMP EAX,1D
00C21032 |. >JA SHORT 64TypeTe.00C21096
00C21034 |> >CMP EAX,1D
00C21037 |. >JNZ SHORT 64TypeTe.00C2103D
00C21039 |. >TEST ECX,ECX
00C2103B |. >JE SHORT 64TypeTe.00C21081
00C2103D |> >CMP EAX,15
00C21040 |. >JNZ SHORT 64TypeTe.00C21046
00C21042 |. >TEST ECX,ECX
00C21044 |. >JE SHORT 64TypeTe.00C2106C
00C21046 |> >CMP EAX,19
00C21049 |. >JNZ 64TypeTe.00C21120
00C2104F |. >TEST ECX,ECX
00C21051 |. >JNZ 64TypeTe.00C21120
00C21057 |. >PUSH 64TypeTe.??_C@_02IFGANFKF@25?$AA@ ; /format = "25"
00C2105C |. >CALL DWORD PTR DS:[<&MSVCR100.printf>] ; \printf
00C21062 |. >ADD ESP,4
00C21065 |. >MOV EAX,3
00C2106A |. >POP EBP
00C2106B |. >RETN
00C2106C |> >PUSH 64TypeTe.??_C@_02OBAMBAKB@21?$AA@ ; /format = "21"
00C21071 |. >CALL DWORD PTR DS:[<&MSVCR100.printf>] ; \printf
00C21077 |. >ADD ESP,4
00C2107A |. >MOV EAX,3
00C2107F |. >POP EBP
00C21080 |. >RETN
00C21081 |> >PUSH 64TypeTe.??_C@_02CJNFJKKJ@29?$AA@ ; /format = "29"
00C21086 |. >CALL DWORD PTR DS:[<&MSVCR100.printf>] ; \printf
00C2108C |. >ADD ESP,4
00C2108F |. >MOV EAX,3
00C21094 |. >POP EBP
00C21095 |. >RETN
00C21096 |> >CMP EAX,1F
00C21099 |. >JNZ 64TypeTe.00C21120
00C2109F |. >TEST ECX,ECX
00C210A1 |. >JNZ SHORT 64TypeTe.00C21120
00C210A3 |. >PUSH 64TypeTe.??_C@_02OAMOHKJG@31?$AA@ ; /format = "31"
00C210A8 |. >CALL DWORD PTR DS:[<&MSVCR100.printf>] ; \printf
00C210AE |. >ADD ESP,4
00C210B1 |. >MOV EAX,3
00C210B6 |. >POP EBP
00C210B7 |. >RETN
00C210B8 |> >PUSH 64TypeTe.??_C@_02PMJKFNFC@40?$AA@ ; /format = "40"
00C210BD |. >CALL DWORD PTR DS:[<&MSVCR100.printf>] ; \printf
00C210C3 |. >ADD ESP,4
00C210C6 |. >MOV EAX,3
00C210CB |. >POP EBP
00C210CC |. >RETN
00C210CD |> >CMP EAX,-0B
00C210D0 |. >JNZ SHORT 64TypeTe.00C210D7
00C210D2 |. >CMP ECX,-1
00C210D5 |. >JE SHORT 64TypeTe.00C21112
00C210D7 |> >CMP EAX,-3
00C210DA |. >JNZ SHORT 64TypeTe.00C210E1
00C210DC |. >CMP ECX,-1
00C210DF |. >JE SHORT 64TypeTe.00C210FD
00C210E1 |> >AND EAX,ECX
00C210E3 |. >CMP EAX,-1
00C210E6 |. >JNZ SHORT 64TypeTe.00C21120
00C210E8 |. >PUSH 64TypeTe.??_C@_02PGHGPEOM@?91?$AA@ ; /format = "-1"
00C210ED |. >CALL DWORD PTR DS:[<&MSVCR100.printf>] ; \printf
00C210F3 |. >ADD ESP,4
00C210F6 |. >MOV EAX,3
00C210FB |. >POP EBP
00C210FC |. >RETN
00C210FD |> >PUSH 64TypeTe.??_C@_02MEEAJGGO@?93?$AA@ ; /format = "-3"
00C21102 |. >CALL DWORD PTR DS:[<&MSVCR100.printf>] ; \printf
00C21108 |. >ADD ESP,4
00C2110B |. >MOV EAX,3
00C21110 |. >POP EBP
00C21111 |. >RETN
00C21112 |> >PUSH 64TypeTe.??_C@_03GPBHNPBF@?911?$AA@ ; /format = "-11"
00C21117 |. >CALL DWORD PTR DS:[<&MSVCR100.printf>] ; \printf
00C2111D |. >ADD ESP,4
00C21120 |> >MOV EAX,3
00C21125 |. >POP EBP
00C21126 \. >RETN
I would actually still say this code is "bugged" due to the repeated (superfluous) checks for the high DWORD and maximum values; this is likely due to an oversight in the code generation for register-spanned values when combined with the chained branches (though it's unlikely to be fixed, with VS2010 being quite "old").
In this case, use of a long long seems a little pointless (so does the use of unsigned, given that you have signed values in your cases).
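As a rough illustration of what happens to the signed case labels, the sketch below converts them to the unsigned 64-bit switch type and splits the result into the two DWORDs seen in the listing (CMP EAX,-0B with CMP ECX,-1 for -11, and AND EAX,ECX / CMP EAX,-1 for -1). The helper names are hypothetical and uint64_t stands in for unsigned long long.

```c
#include <stdint.h>

/* High and low DWORDs of a 64-bit value, as held in ECX and EAX. */
uint32_t high_dword(uint64_t v) { return (uint32_t)(v >> 32); }
uint32_t low_dword(uint64_t v)  { return (uint32_t)v; }

/* A negative case label converted to the unsigned 64-bit switch type. */
uint64_t as_switch_value(long long label) { return (uint64_t)label; }

/* The AND EAX,ECX / CMP EAX,-1 check: the AND of both halves is all ones
   only when each half is all ones, i.e. only for the converted -1. */
int is_minus_one(uint64_t v) {
    return (high_dword(v) & low_dword(v)) == 0xFFFFFFFFu;
}
```

Here as_switch_value(-11) is 0xFFFFFFFFFFFFFFF5, so its high DWORD is 0xFFFFFFFF (-1 as a signed 32-bit value) and its low DWORD is 0xFFFFFFF5 (-0B), which matches the pair of compares at 00C210CD.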
Upvotes: 1