Maxpm

Reputation: 25592

Why do forward reference ADR instructions assemble with even offsets in Thumb code?

To bx to a Thumb function, the least significant bit of the address needs to be set. The GNU as documentation states how this works when the address is generated from an adr pseudo-instruction:

adr <register> <label>

This instruction will load the address of label into the indicated register. [...]

If label is a thumb function symbol, and thumb interworking has been enabled via the -mthumb-interwork option then the bottom bit of the value stored into register will be set. This allows the following sequence to work as expected:

adr r0, thumb_function

blx r0

So it sounds like things should just work. However, looking at some disassembly, it seems like certain addresses do not have that bottom bit set.

For example, assembling and linking:

.syntax unified
.thumb

.align 2
table:
    .4byte f1
    .4byte f2
    .4byte f3

.align 2
.type f1, %function
.thumb_func
f1:
    adr r1, f1
    adr r2, f2
    adr r3, f3
    bx r1

.align 2
.type f2, %function
.thumb_func
f2:
    adr r1, f1
    adr r2, f2
    adr r3, f3
    bx r2

.align 2
.type f3, %function
.thumb_func
f3:
    adr r1, f1
    adr r2, f2
    adr r3, f3
    bx r3

With:

arm-none-eabi-as adr_test.s -mthumb -mthumb-interwork -o adr_test.o
arm-none-eabi-ld adr_test.o

And checking with arm-none-eabi-objdump -D a.out, I get:

00008000 <table>:
    8000:   0000800d    .word   0x0000800d
    8004:   00008019    .word   0x00008019
    8008:   00008025    .word   0x00008025

0000800c <f1>:
    800c:   f2af 0103   subw    r1, pc, #3
    8010:   a201        add r2, pc, #4  ; (adr r2, 8018 <f2>)
    8012:   a304        add r3, pc, #16 ; (adr r3, 8024 <f3>)
    8014:   4708        bx  r1
    8016:   46c0        nop         ; (mov r8, r8)

00008018 <f2>:
    8018:   f2af 010f   subw    r1, pc, #15
    801c:   f2af 0207   subw    r2, pc, #7
    8020:   a300        add r3, pc, #0  ; (adr r3, 8024 <f3>)
    8022:   4710        bx  r2

00008024 <f3>:
    8024:   f2af 011b   subw    r1, pc, #27
    8028:   f2af 0213   subw    r2, pc, #19
    802c:   f2af 030b   subw    r3, pc, #11
    8030:   4718        bx  r3
    8032:   46c0        nop         ; (mov r8, r8)

There are a few things to note:

  1. In table, the absolute addresses of f1, f2, and f3 are all odd, as expected. So, clearly, the assembler and linker know that those three functions should be Thumb.
  2. For backward references, where the adr pseudo-instruction assembles down to a subw, the offset is odd, as expected.
  3. But for forward references, where the adr pseudo-instruction assembles to an add, the offset is even.

What am I missing?

Upvotes: 4

Views: 1156

Answers (3)

old_timer

Reputation: 71536

Coming back to this question. The bug is truly this simple:

  if (inst.relocs[0].exp.X_op == O_symbol
      && inst.relocs[0].exp.X_add_symbol != NULL
      && S_IS_DEFINED (inst.relocs[0].exp.X_add_symbol)
      && THUMB_IS_FUNC (inst.relocs[0].exp.X_add_symbol))
    inst.relocs[0].exp.X_add_number += 1;

in the do_t_adr() function.

S_IS_DEFINED checks whether the symbol is defined. For a forward reference the symbol is not yet defined at this point in time, so the condition fails and the one is not added. (That it adds one at all is disturbing; for cleanliness it should ORR one, but whatever.) For a backward reference the symbol is already defined, so the adjustment is made. (Naturally, THUMB_IS_FUNC won't work without a defined symbol either.)

The ADR is converted into a BFD_RELOC_ARM_THUMB_ADD relocation, which takes us here:

case BFD_RELOC_ARM_THUMB_ADD:
  /* This is a complicated relocation, since we use it for all of
 the following immediate relocations:

    3bit ADD/SUB
    8bit ADD/SUB
    9bit ADD/SUB SP word-aligned
   10bit ADD PC/SP word-aligned

 The type of instruction being processed is encoded in the
 instruction field:

   0x8000  SUB
   0x00F0  Rd
   0x000F  Rs
  */

and within that here:

else if (rs == REG_PC || rs == REG_SP)
  {
    /* PR gas/18541.  If the addition is for a defined symbol
       within range of an ADR instruction then accept it.  */

That code, which runs on a later pass (after the symbol has been defined and can be found), does not patch up the immediate/offset.

I find it even more disturbing/buggy that it can't handle this without .syntax unified.

.thumb
.thumb_func
zero:
    adr r0,zero

Even with .syntax unified they didn't finish implementing ADR for T16. They just put an error in there and called it done. (It can certainly be implemented in T16, for example as add rx,pc,#0 followed by sub rx,#offset.)

Even if they fixed it I would avoid the ADR instruction. But it is clear they didn't bother to actually finish implementing this pseudo instruction.

Note in arm mode they have the same bug, checking for the symbol at the wrong time.

  if (support_interwork
      && inst.relocs[0].exp.X_op == O_symbol
      && inst.relocs[0].exp.X_add_symbol != NULL
      && S_IS_DEFINED (inst.relocs[0].exp.X_add_symbol)
      && THUMB_IS_FUNC (inst.relocs[0].exp.X_add_symbol))
    inst.relocs[0].exp.X_add_number |= 1;

Note the ORR rather than ADD of one here: a better (or at least different) author, who still didn't quite think this solution through.

If I remove the S_IS_DEFINED and THUMB_IS_FUNC checks:

.arm
zero:
    adr r0,two
.thumb
.thumb_func
two:
    nop

goes from:

00000000 <zero>:
   0:   e24f0004    sub r0, pc, #4

00000004 <two>:
   4:   46c0        nop         ; (mov r8, r8)
   6:   46c0        nop         ; (mov r8, r8)

to:

00000000 <zero>:
   0:   e24f0003    sub r0, pc, #3

00000004 <two>:
   4:   46c0        nop         ; (mov r8, r8)
   6:   46c0        nop         ; (mov r8, r8)

Likewise:

.syntax unified

.thumb
    adr r0,two
    nop
    nop
.thumb_func
two:
    nop

gives:

00000000 <two-0x8>:
   0:   f20f 0005   addw    r0, pc, #5
   4:   46c0        nop         ; (mov r8, r8)
   6:   46c0        nop         ; (mov r8, r8)

00000008 <two>:
   8:   46c0        nop         ; (mov r8, r8)

Note that this could easily have been implemented using T16 instructions (taking 4 bytes, just like the T32 solution), but that is, as mentioned, yet another bug:

.syntax unified
.cpu cortex-m0
.thumb
    adr r0,two
    nop
    nop
.thumb_func
two:
    nop

/path/so.s: Assembler messages:
/path/so.s:5: Error: invalid immediate for address calculation (value = 0x00000003)

(and that bug is in the same section of code that has this bug you pointed out)

It would be interesting to see first what the documentation for other assemblers says with respect to ADR and thumb, and second if they actually implement it per that documentation and/or bail out with an error or warning.

Upvotes: 1

Maxpm

Reputation: 25592

This was a bug in the GNU Assembler (gas). It should be fixed in v2.37.

Upvotes: 2

Ross Ridge

Reputation: 39591

What you're missing is this line from the ARM documentation for the ADR pseudo-instruction:

If you use ADR to generate a target for a BX or BLX instruction, it is your responsibility to set the Thumb bit (bit 0) of the address if the target contains Thumb instructions.

The forward-referencing ADR instructions use the 16-bit Thumb "ADD Rd, pc, #imm" form of the ADD instruction. The immediate for this instruction is in the range 0-1020 and must be word aligned (i.e., it's encoded in an 8-bit field and multiplied by 4). The PC value used also has its lower two bits set to 0, so the instruction is incapable of generating an odd address.

Forcing the assembler to always use a 32-bit Thumb instruction with ADR.W should cause it to always generate an odd address when a function label is used, but I don't know if you can depend on this. It would probably be better just to set the lower bit explicitly.

Upvotes: 3
