kpdev

Reputation: 610

Reduce clang-generated code size for ARM

I am comparing the code that clang and gcc generate for ARM.

Unfortunately, gcc's output usually has fewer instructions. I am curious: is it possible to reduce the size of the code generated by clang? Maybe there are options I should be using.

Please consider this very simple example:

> cat test.c
int to_upper(int c)  
{  
   if(c < 'a' || c > 'z') return c; 
   else return c - ('a' - 'A');  
}

> clang -target arm-none-eabi -Oz -c -mthumb -mcpu=cortex-m0 -msoft-float ./test.c -o ./clang_test.o 
> /usr/bin/arm-none-eabi-gcc -Os -c -mthumb -mcpu=cortex-m0 -msoft-float ./test.c -o ./gcc_test.o 

> /usr/bin/arm-none-eabi-objdump -d ./clang_test.o 
./clang_test.o:     file format elf32-littlearm 
Disassembly of section .text: 
00000000 <to_upper>: 
   0:   4602        mov r2, r0 
   2:   3a61        subs    r2, #97 ; 0x61 
   4:   4601        mov r1, r0 
   6:   3920        subs    r1, #32 
   8:   2a19        cmp r2, #25 
   a:   d800        bhi.n   e <to_upper+0xe>
   c:   4608        mov r0, r1 
   e:   4770        bx  lr 

> /usr/bin/arm-none-eabi-objdump -d ./gcc_test.o 
./gcc_test.o:     file format elf32-littlearm 
Disassembly of section .text: 
00000000 <to_upper>: 
   0:   1c03        adds    r3, r0, #0 
   2:   3b61        subs    r3, #97 ; 0x61 
   4:   2b19        cmp r3, #25 
   6:   d800        bhi.n   a <to_upper+0xa>
   8:   3820        subs    r0, #32 
   a:   4770        bx  lr 

Why is there so much difference for such simple code? Can clang generate less code in this case, at least as little as gcc?

Note: if we change the CPU to -mcpu=cortex-a5 (the other options remain the same), then clang and gcc produce identical code:

00000000 <to_upper>: 
   0:   f1a0 0361 sub.w r3, r0, #97 ; 0x61 
   4:   2b19        cmp r3, #25 
   6:   bf98        it  ls 
   8:   3820        subls   r0, #32 
   a:   4770        bx  lr 
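For reference, all of the listings above use the same trick: subtract 'a' (97) and do a single unsigned comparison against 25 (`bhi` / `ls`) instead of two signed comparisons. In the Cortex-M0 case the extra clang instructions come from computing `c - 32` into a separate register before the branch. Below is a minimal sketch of that range check written out in C. The names `to_upper_range` and `check_equivalence` are hypothetical and only for illustration, and I have not verified whether writing the source this way changes what clang 3.7.1 emits.

```c
#include <assert.h>

/* Original form from the question: two signed comparisons. */
int to_upper(int c)
{
    if (c < 'a' || c > 'z') return c;
    return c - ('a' - 'A');
}

/* Hypothetical rewrite: one unsigned comparison.
   Values below 'a' wrap around to large unsigned numbers,
   so they fail the "<= 25" test just like values above 'z'. */
int to_upper_range(int c)
{
    if ((unsigned)c - (unsigned)'a' > (unsigned)('z' - 'a')) return c;
    return c - ('a' - 'A');
}

/* Sanity check (illustrative range only): both forms agree. */
void check_equivalence(void)
{
    int c;
    for (c = -1000; c <= 1000; ++c)
        assert(to_upper(c) == to_upper_range(c));
}
```

Compiling both functions with the same flags as above and comparing the objdump output would show whether the source form matters for a given compiler version.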

OS: Ubuntu 14.04.3

clang version 3.7.1 (tags/RELEASE_371/final)
Target: x86_64-unknown-linux-gnu
Thread model: posix

arm-none-eabi-gcc (4.8.2-14ubuntu1+6) 4.8.2

Upvotes: 2

Views: 1228

Answers (1)

kpdev

Reputation: 610

No, clang cannot generate less code in this case, nor in many others.

Historically, very few code size optimizations have been implemented in LLVM. When optimizing for code size, GCC typically outperforms LLVM significantly.

Here is a presentation that takes a closer look at how GCC and Clang compare in terms of code size optimization:

Presentation video

Upvotes: 1
