izac89

Reputation: 3940

Understanding a certain ELF file structure

From ARM's infocenter, regarding section static linking and relocations:

** Section #1 'ER_RO' (SHT_PROGBITS) [SHF_ALLOC + SHF_EXECINSTR]
Size : 28 bytes (alignment 4)
Address: 0x00008000
$a
.text
bar
    0x00008000: E59F000C .... LDR r0,[pc,#12] ; [0x8014] = 0x801C
    0x00008004: E5901000 .... LDR r1,[r0,#0]
    0x00008008: E2411001 ..A. SUB r1,r1,#1
    0x0000800C: E5801000 .... STR r1,[r0,#0]
    0x00008010: E12FFF1E ../. BX lr
$d
    0x00008014: 0000801C .... DCD 32796
$a
.text
foo
    0x00008018: EAFFFFF8 .... B bar ; 0x8000

and from ELF for the ARM architecture:

Table 4-7, Mapping symbols
Name Meaning
$a - Start of a sequence of ARM instructions
$d - Start of a sequence of data items (for example, a literal pool)

As you can see, the ELF file contains a single section that holds code (bar), then read-only data (the literal 32796, i.e. 0x801C), then more code (foo), all at consecutive addresses.
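To make the layout concrete, here is a minimal sketch that turns the mapping symbols from the listing into address ranges. The addresses and the 28-byte section size come from the listing above; the function name `classify_ranges` and the labels are made up for illustration.

```python
# Mapping symbols taken from the listing, as (address, name) pairs.
mapping_symbols = [(0x8000, "$a"), (0x8014, "$d"), (0x8018, "$a")]
SECTION_END = 0x8000 + 28  # section address + size from the listing


def classify_ranges(symbols, end):
    """Return (start, stop, kind) ranges; each symbol covers up to the next one."""
    kinds = {"$a": "ARM code", "$d": "data"}
    ordered = sorted(symbols)
    ranges = []
    for i, (addr, name) in enumerate(ordered):
        # A mapping symbol stays in effect until the next mapping symbol
        # (or the end of the section).
        stop = ordered[i + 1][0] if i + 1 < len(ordered) else end
        ranges.append((addr, stop, kinds[name]))
    return ranges


for start, stop, kind in classify_ranges(mapping_symbols, SECTION_END):
    print(f"{start:#06x}-{stop:#06x}: {kind}")
```

This shows one executable section with a 4-byte data island between two runs of ARM instructions.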

Now, a basic principle of any software image layout is that it is composed of distinct, separate sections: text (code), data, and bss (and rodata, to be pedantic), as we can see when examining the MAP file.

So this ELF structure is not consistent with that basic principle. My question is: what is going on here? Am I mistaken about this basic principle? If not, will this ELF structure be changed at run time to achieve the section separation? And why does the ELF section contain mixed types in one sequential address range?

NOTE: I assume the scatter file used in the example is the default one, since the document containing the example does not provide any scatter file along with it.

Upvotes: 0

Views: 565

Answers (1)

Florian Weimer

Reputation: 33747

At run time, the sections do not matter, only the PT_LOAD segments in the program header. The ELF specification is quite flexible there as well, but some loaders have restrictions on the PT_LOAD segments they can process.
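As a rough illustration of what a loader looks at, the sketch below reads the program headers of a little-endian ELF32 image and lists only the PT_LOAD entries. The image is synthetic: its single segment (address 0x8000, 28 bytes, read + execute) is an assumption modelled on the listing in the question, not taken from a real file, and `load_segments` is a hypothetical helper name.

```python
import struct

PT_LOAD = 1
# ELF32 file header (52 bytes) and program header (32 bytes), little-endian.
EHDR = struct.Struct("<16sHHIIIIIHHHHHH")
PHDR = struct.Struct("<8I")


def load_segments(image: bytes):
    """Return (vaddr, memsz, perms) for every PT_LOAD program header."""
    fields = EHDR.unpack_from(image, 0)
    ident = fields[0]
    # ident[4] == 1 means ELFCLASS32, ident[5] == 1 means little-endian.
    if ident[:4] != b"\x7fELF" or ident[4] != 1 or ident[5] != 1:
        raise ValueError("expected a little-endian ELF32 image")
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]
    segments = []
    for i in range(e_phnum):
        (p_type, _p_offset, p_vaddr, _p_paddr,
         _p_filesz, p_memsz, p_flags, _p_align) = PHDR.unpack_from(
            image, e_phoff + i * e_phentsize)
        if p_type == PT_LOAD:
            # PF_R = 4, PF_W = 2, PF_X = 1
            perms = "".join(ch if p_flags & bit else "-"
                            for ch, bit in (("r", 4), ("w", 2), ("x", 1)))
            segments.append((p_vaddr, p_memsz, perms))
    return segments


# Synthetic image (all values are assumptions): one r-x segment at 0x8000.
ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)
ehdr = EHDR.pack(ident, 2, 40, 1, 0x8000, 52, 0, 0, 52, 32, 1, 0, 0, 0)
phdr = PHDR.pack(PT_LOAD, 52 + 32, 0x8000, 0x8000, 28, 28, 5, 4)
image = ehdr + phdr + bytes(28)

for vaddr, memsz, perms in load_segments(image):
    print(f"PT_LOAD at {vaddr:#x}, {memsz} bytes, {perms}")
```

Note that no section header is consulted here; the section header fields in the synthetic file header are simply zero.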

The reason for mixing code and data this way could be that this architecture supports only a limited range of PC-relative addressing and needs a constant pool for loading most constants (because constructing them via immediates is too expensive). Having as few, and as large, constant pools as possible is attractive because it improves data and instruction cache utilization (otherwise each cache ends up holding memory of the wrong type, which thus can never be used), but you may still need more than one pool if the code size exceeds what can be addressed directly.
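The PC-relative arithmetic can be checked against the listing in the question. The sketch below assumes ARM (not Thumb) state, where the PC reads as the instruction address plus 8 and an LDR immediate offset is 12 bits wide (so a literal must lie within 4095 bytes of that value). The helper names are made up for illustration.

```python
def arm_pc(instr_addr: int) -> int:
    """In ARM state, reading PC yields the instruction address + 8."""
    return instr_addr + 8


def ldr_literal_address(instr_addr: int, imm12: int) -> int:
    """Address accessed by LDR rX,[pc,#imm12] located at instr_addr."""
    return arm_pc(instr_addr) + imm12


def branch_target(instr_addr: int, instr_word: int) -> int:
    """Target of an ARM B instruction: sign-extended imm24, times 4, plus PC."""
    imm24 = instr_word & 0x00FFFFFF
    if imm24 & 0x00800000:      # sign bit of the 24-bit field
        imm24 -= 1 << 24
    return arm_pc(instr_addr) + (imm24 << 2)


# LDR r0,[pc,#12] at 0x8000 reads the literal pool entry at 0x8014.
print(hex(ldr_literal_address(0x8000, 12)))        # 0x8014
# B bar at 0x8018, encoded as EAFFFFF8, jumps back to 0x8000.
print(hex(branch_target(0x8018, 0xEAFFFFF8)))      # 0x8000
```

Because the load can only reach a short distance from the instruction, the literal has to sit in the same region as the code that uses it, which is why the DCD word ends up between bar and foo.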

Upvotes: 2
