Reputation: 1382
For the code:
uint8_t count;
ISR(TIMER1_OVF_vect, ISR_NAKED)
{
count++;
reti();
}
The generated assembly is:
--- F:\atmel-prj\compiler-test2\compiler-test2\Debug/.././compiler-test2.c -----
{
00000048 PUSH R1 Push register on stack
00000049 PUSH R0 Push register on stack
0000004A IN R0,0x3F In from I/O location
0000004B PUSH R0 Push register on stack
0000004C CLR R1 Clear Register
0000004D PUSH R24 Push register on stack
count++;
0000004E LDS R24,0x0100 Load direct from data space
00000050 SUBI R24,0xFF Subtract immediate
00000051 STS 0x0100,R24 Store direct to data space
}
00000053 POP R24 Pop register from stack
00000054 POP R0 Pop register from stack
00000055 OUT 0x3F,R0 Out to I/O location
00000056 POP R0 Pop register from stack
00000057 POP R1 Pop register from stack
00000058 RETI Interrupt return
Now, as I see it, at least one, and up to two push-pop pairs could be eliminated:
To save one push-pop pair, eliminate r24 by using r0 instead:
PUSH R1 Push register on stack
PUSH R0 Push register on stack
IN R0,0x3F In from I/O location
PUSH R0 Push register on stack
CLR R1 Clear Register
count++;
LDS R0,0x0100 Load direct from data space
INC R0 Increment (SUBI only accepts R16..R31, so it cannot be used on R0)
STS 0x0100,R0 Store direct to data space
...
To save a second pair, note that nothing in this handler relies on r1 holding zero, so r1 alone can be used for everything:
PUSH R1 Push register on stack
IN R1,0x3F In from I/O location
PUSH R1 Push register on stack
count++;
LDS R1,0x0100 Load direct from data space
INC R1 Increment (SUBI only accepts R16..R31)
STS 0x0100,R1 Store direct to data space
...
Either of these would save precious bytes and microseconds.
Is there a way to put these or similar optimizations into the Atmel Studio toolchain or its libraries, so that my compiled code comes out slightly better?
Much of the code around interrupts and function calls, and some of the C-to-assembly translation, could be optimized considerably.
Upvotes: 0
Views: 406
Reputation: 323
I also found that the generated assembly code could be optimized further. At that time I was not using any -On optimization option.
I found that a called function which takes a parameter in a register copies that parameter to the stack (to keep a clean copy as an 'automatic' C variable, which is useless in most cases) and then copies the value back into the originating register, even though that register had just been read and GCC should know it.
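To check whether a file is really being built with optimization, GCC's predefined macros can be tested at compile time. The following is only a minimal sketch: `__OPTIMIZE__` and `__OPTIMIZE_SIZE__` are GCC's own macros, while the helper name `optimization_mode` is made up for this example.

```c
/* Hypothetical helper: reports how this translation unit was compiled.
   GCC predefines __OPTIMIZE__ whenever optimization is enabled and
   additionally __OPTIMIZE_SIZE__ when optimizing for size (e.g. -Os).
   Neither macro is defined at -O0. */
const char *optimization_mode(void)
{
#if defined(__OPTIMIZE_SIZE__)
    return "size";
#elif defined(__OPTIMIZE__)
    return "speed";
#else
    return "none";
#endif
}
```

If this returns "none" for the file that contains the interrupt handler, the redundant stack copies described above are probably just the unoptimized code generation.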
If you use Atmel Studio with its default compiler, which is GCC, the quality of the generated code depends on the toolchain's behaviour, and therefore on the options passed on its command line.
For the best result, you can try compiling your source tree with avr-gcc directly, which lets you set options that Atmel Studio can't set for you.
Using another toolchain is another way to get (theoretically) better results. However, I doubt any other toolchain will give a better overall result than GCC does.
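If the prologue is still too heavy for a particular handler, another option that leaves the toolchain alone is to write that one handler by hand. The `ISR_NAKED` attribute already used in the question tells avr-gcc to emit no prologue or epilogue, so the body must save and restore everything it touches, including SREG. The following is only a sketch under a few assumptions: `count` is a global with external linkage so the assembler can resolve its name, and the `#else` branch with `timer1_ovf_handler` is a made-up stand-in so the logic can be tried on a PC.

```c
#include <stdint.h>

volatile uint8_t count;

#ifdef __AVR__
#include <avr/interrupt.h>

/* Naked ISR: one scratch register plus SREG are saved, nothing else. */
ISR(TIMER1_OVF_vect, ISR_NAKED)
{
    __asm__ __volatile__(
        "push r24            \n\t"  /* save scratch register          */
        "in   r24, __SREG__  \n\t"  /* read status register           */
        "push r24            \n\t"  /* save status register           */
        "lds  r24, count     \n\t"  /* load the counter               */
        "subi r24, 0xFF      \n\t"  /* add 1 (changes flags in SREG)  */
        "sts  count, r24     \n\t"  /* store the counter              */
        "pop  r24            \n\t"  /* fetch saved status register    */
        "out  __SREG__, r24  \n\t"  /* restore status register        */
        "pop  r24            \n\t"  /* restore scratch register       */
        "reti                \n\t"  /* return and re-enable interrupts */
    );
}
#else
/* Hypothetical host stand-in with the same effect, for off-target tests. */
void timer1_ovf_handler(void)
{
    count++;
}
#endif
```

This uses two push-pop pairs instead of the four in the listing from the question. The price is that the handler is no longer checked by the compiler, so any register added later has to be saved by hand as well.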
Upvotes: 1