Reputation: 23
I'm interested in knowing what the conventions are for zero initialized segments/sections in ELF files.
Typically it is written that the .bss section of an ELF file should be zero initialized prior to use.
I'm wondering if the convention/specification specifies if said zeroing needs to be done by the ELF loader or if it should be done by the ELF instructions themselves.
For the latter case, I believe, this would be some instructions/functions between the ELF entry point and the main function of the program...
Thanks
Upvotes: 2
Views: 864
Reputation: 213955
I'm wondering if the convention/specification specifies if said zeroing needs to be done by the ELF loader
Yes, indirectly.
or if it should be done by the ELF instructions themselves.
There is no such thing as "ELF instructions". The ELF format (for an executable or shared library) tells the loader how to load and start the ELF object, but all of the parsing and setup happens in the loader.
So how does zero-ing actually happen?
int foo[40960];
int main() { return 0; }
gcc t.c -no-pie
readelf -Wl a.out
Elf file type is EXEC (Executable file)
Entry point 0x401020
There are 11 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000040 0x0000000000400040 0x0000000000400040 0x000268 0x000268 R 0x8
INTERP 0x0002a8 0x00000000004002a8 0x00000000004002a8 0x00001c 0x00001c R 0x1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x000400 0x000400 R 0x1000
LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x00017d 0x00017d R E 0x1000
LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x000110 0x000110 R 0x1000
LOAD 0x002e50 0x0000000000403e50 0x0000000000403e50 0x0001d8 0x028210 RW 0x1000
DYNAMIC 0x002e60 0x0000000000403e60 0x0000000000403e60 0x000190 0x000190 RW 0x8
NOTE 0x0002c4 0x00000000004002c4 0x00000000004002c4 0x000044 0x000044 R 0x4
GNU_EH_FRAME 0x002004 0x0000000000402004 0x0000000000402004 0x000034 0x000034 R 0x4
GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0x10
GNU_RELRO 0x002e50 0x0000000000403e50 0x0000000000403e50 0x0001b0 0x0001b0 R 0x1
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.gnu.build-id .note.ABI-tag .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn
03 .init .text .fini
04 .rodata .eh_frame_hdr .eh_frame
05 .init_array .fini_array .dynamic .got .got.plt .data .bss
06 .dynamic
07 .note.gnu.build-id .note.ABI-tag
08 .eh_frame_hdr
09
10 .init_array .fini_array .dynamic .got
Here you can see that .bss is in segment 05, the last LOAD segment. That segment has a tiny .p_filesz == 0x1d8, but a pretty large .p_memsz == 0x28210.
The loader then performs mmap(...) with file offset 0x2e50 rounded down to page size (i.e. 0x2000) and size of 0x028210 + 0xe50 == 0x29060, which is larger than the file (the file is only 0x3d20 bytes in my case).
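The page rounding above can be sketched in C. This is only an illustration of the arithmetic, not code taken from any real loader; the names map_request and page_align_segment are made up for this example, and the page size is assumed to be a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: the offset and length a loader might pass to mmap. */
typedef struct {
    uint64_t file_offset; /* p_offset rounded down to a page boundary */
    uint64_t length;      /* p_memsz plus the bytes added by rounding down */
} map_request;

/* Hypothetical helper: page-align a segment described by p_offset/p_memsz. */
map_request page_align_segment(uint64_t p_offset, uint64_t p_memsz,
                               uint64_t page_size)
{
    /* The mask trick below only works for power-of-two page sizes. */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    uint64_t slack = p_offset & (page_size - 1); /* 0xe50 for 0x2e50 */
    map_request r;
    r.file_offset = p_offset - slack;
    r.length = p_memsz + slack;
    return r;
}
```

Calling page_align_segment(0x2e50, 0x28210, 0x1000) gives a file offset of 0x2000 and a length of 0x29060, the same numbers as in the readelf walkthrough.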
It is the mmap system call which provides zero-filled pages to any mapping which extends past the end of the file, so the actual zeroing instructions are part of the kernel, not the loader.
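You can also check the result from inside the program. The sketch below reuses the foo array from t.c and adds a made-up helper, count_nonzero, that scans it; the program itself contains no code that clears the array. It assumes a typical Linux toolchain where the startup code does not zero .bss on its own.

```c
#include <assert.h> /* used by the check described in the note below */
#include <stddef.h>

/* Same definition as in t.c: an uninitialized global, placed in .bss. */
int foo[40960];

/* Hypothetical helper: count the elements of p[0..n) that are not zero. */
size_t count_nonzero(const int *p, size_t n)
{
    size_t nonzero = 0;
    for (size_t i = 0; i < n; i++) {
        if (p[i] != 0)
            nonzero++;
    }
    return nonzero;
}
```

Putting assert(count_nonzero(foo, 40960) == 0); as the first statement of main should pass, because the pages backing foo were already zero when the process started.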
P.S. You can observe this by looking at readelf output on your system and comparing it to the output from strace ./a.out.
Upvotes: 1