Zero initialized segments/sections in ELF files

Question

I'm interested in knowing what the conventions are for zero initialized segments/sections in ELF files.

Typically it is written that the .bss section of an ELF file should be zero initialized prior to use.

I'm wondering if the convention/specification specifies if said zeroing needs to be done by the ELF loader or if it should be done by the ELF instructions themselves.

For the latter case, I believe, this would be some instructions/functions between the ELF entry point and the main function of the program...

Thanks

Employed Russian · Accepted Answer

I'm wondering if the convention/specification specifies if said zeroing needs to be done by the ELF loader

Yes, indirectly.

or if it should be done by the ELF instructions themselves.

There is no such thing as "ELF instructions". The ELF format (for an executable or a shared library) tells the loader how to load and start the ELF object, but all of the parsing and setup happens in the loader.

So how does zero-ing actually happen?

int foo[40960];
int main() { return 0; }

gcc t.c -no-pie

readelf -Wl a.out

Elf file type is EXEC (Executable file)
Entry point 0x401020
There are 11 program headers, starting at offset 64

Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000400040 0x0000000000400040 0x000268 0x000268 R   0x8
  INTERP         0x0002a8 0x00000000004002a8 0x00000000004002a8 0x00001c 0x00001c R   0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x000000 0x0000000000400000 0x0000000000400000 0x000400 0x000400 R   0x1000
  LOAD           0x001000 0x0000000000401000 0x0000000000401000 0x00017d 0x00017d R E 0x1000
  LOAD           0x002000 0x0000000000402000 0x0000000000402000 0x000110 0x000110 R   0x1000
  LOAD           0x002e50 0x0000000000403e50 0x0000000000403e50 0x0001d8 0x028210 RW  0x1000
  DYNAMIC        0x002e60 0x0000000000403e60 0x0000000000403e60 0x000190 0x000190 RW  0x8
  NOTE           0x0002c4 0x00000000004002c4 0x00000000004002c4 0x000044 0x000044 R   0x4
  GNU_EH_FRAME   0x002004 0x0000000000402004 0x0000000000402004 0x000034 0x000034 R   0x4
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x10
  GNU_RELRO      0x002e50 0x0000000000403e50 0x0000000000403e50 0x0001b0 0x0001b0 R   0x1

 Section to Segment mapping:
  Segment Sections...
   00
   01     .interp
   02     .interp .note.gnu.build-id .note.ABI-tag .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn
   03     .init .text .fini
   04     .rodata .eh_frame_hdr .eh_frame
   05     .init_array .fini_array .dynamic .got .got.plt .data .bss
   06     .dynamic
   07     .note.gnu.build-id .note.ABI-tag
   08     .eh_frame_hdr
   09
   10     .init_array .fini_array .dynamic .got

Here you can see that

.bss is in segment 05 (the fourth LOAD segment).
That LOAD segment has a tiny .p_filesz == 0x1d8, but a pretty large .p_memsz == 0x28210.

The loader then performs mmap(...) with file offset 0x2e50 rounded down to page size (i.e. 0x2000) and size of 0x028210 + 0xe50 == 0x29060, which is larger than the file (the file is only 0x3d20 bytes in my case).

It is the mmap system call which provides zero-filled pages to any mapping which extends past the end of the file, so the actual zero-ing instructions are part of the kernel, not the loader.

P.S. You can observe this by looking at readelf output on your system and comparing it to the output from strace ./a.out.
