grep

Reputation: 4026

gcc: strange asm generated for simple loop

m68k-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
CFLAGS = -Wall -Werror -ffreestanding -nostdlib -O2 -m68000 -mshort

I am very confused why gcc generates such (seemingly) non-optimal code for a simple for loop over a const array.

const unsigned int pallet[16] = {
  0x0000,
  0x00E0,
  0x000E,
  ...
  0x0000
};

...

volatile unsigned long * const VDP_DATA = (unsigned long *) 0x00C00000;

...

for(int i = 0; i < 16; i++) {
  *VDP_DATA = pallet[i];
}

Results in:

 296:   41f9 0000 037e  lea 37e <pallet+0x2>,%a0
 29c:   223c 0000 039c  movel #924,%d1
 2a2:   4240            clrw %d0
 2a4:   0280 0000 ffff  andil #65535,%d0
 2aa:   23c0 00c0 0000  movel %d0,c00000 <_etext+0xbffc2c>
 2b0:   b288            cmpl %a0,%d1
 2b2:   6712            beqs 2c6 <main+0x46>
 2b4:   3018            movew %a0@+,%d0
 2b6:   0280 0000 ffff  andil #65535,%d0
 2bc:   23c0 00c0 0000  movel %d0,c00000 <_etext+0xbffc2c>
 2c2:   b288            cmpl %a0,%d1
 2c4:   66ee            bnes 2b4 <main+0x34>
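
Read back into C, the listing has roughly the following shape. This is a hand transcription for illustration, not compiler output: the function name copy_as_compiled and the port/pal/n parameters are made up, and fixed-width types stand in for the 16-bit unsigned int that -mshort produces.

```c
#include <stdint.h>

/* Hand transcription of the listing above (an illustration, not compiler
 * output). port stands in for VDP_DATA, pal for pallet; n must be >= 1. */
void copy_as_compiled(volatile uint32_t *port, const uint16_t *pal, int n)
{
    const uint16_t *p = pal + 1;      /* lea pallet+0x2,%a0 */
    const uint16_t *end = pal + n;    /* movel #924,%d1 (one past the end) */
    uint32_t v = pal[0];              /* clrw %d0: pallet[0] is known to be 0 */

    /* Peeled first iteration (2a4..2b2). */
    v &= 0xFFFFu;                     /* andil #65535,%d0 */
    *port = v;                        /* movel %d0,c00000 */
    if (p == end)                     /* cmpl %a0,%d1 ; beqs */
        return;

    /* Remaining iterations (2b4..2c4). */
    do {
        v = *p++;                     /* movew %a0@+,%d0 */
        v &= 0xFFFFu;                 /* andil #65535,%d0 */
        *port = v;                    /* movel %d0,c00000 */
    } while (p != end);               /* cmpl %a0,%d1 ; bnes */
}
```

With n fixed at 16, the early return can never be taken, which is what makes the compare at 2b0 look redundant.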

My main concern:

Why the useless first-element compare at 2b0? It can never be true, and nothing ever branches back to it. It just ends up as duplicated code for the first iteration.

I would have expected something closer to this:

lea pallet,%a0
movel #7,%d0
1:
movel %a0@+,c00000
dbra %d0,1b

I get that I have to be a bit more explicit in my code to get it to write in long chunks. My main point here is: how come gcc can't seem to figure out my intentions, i.e. that I just want to dump this data into this address?
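
A minimal sketch of what "more explicit" could look like in C. The helper name write_palette_long and its parameters are hypothetical, and fixed-width types stand in for the -mshort sizes. Note that this matches the hand-written loop above (8 long writes carrying two entries each), not the original C loop (16 long writes of one zero-extended entry each). Whether this GCC version actually turns it into a dbra loop is untested here.

```c
#include <stdint.h>

/* Hypothetical helper: packs two 16-bit palette entries into each 32-bit
 * write. The first entry of a pair goes in the high word, which is what a
 * big-endian 68000 would read with a single movel from the array.
 * count must be even and >= 2. */
void write_palette_long(volatile uint32_t *port, const uint16_t *pal, int count)
{
    int pairs = count / 2;

    /* Count-down do/while: the loop shape that dbra implements. */
    do {
        uint32_t hi = *pal++;
        uint32_t lo = *pal++;
        *port = (hi << 16) | lo;
    } while (--pairs != 0);
}
```

Combining the halves with shifts, rather than casting the array to a long pointer, keeps the code free of aliasing and alignment assumptions at the cost of relying on the optimizer to merge the two word loads.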

Another observation:

clrw %d0
andil #65535,%d0
movel %d0,c00000

Why not just clrl and move?

Upvotes: 3

Views: 349

Answers (1)

lvd

Reputation: 812

I've been playing with GCC and 68k code generation, and I've found that it simply can't generate decent code for the 68k family anymore, particularly not for the 68000.

The generated code is correct, but not optimized (or should I say, it seems to be DE-optimized?). Try -Os instead of -O2 first. Even then you'll encounter lots of useless instructions in the generated code.

My speculation is that while GCC's support for current architectures moves forward quickly, the 68k backend is not properly maintained and is simply kept correct with minimal effort.

Upvotes: 1
