Edward Chamberlain

Reputation: 302

Understanding ELF Binary Size for nostdlib C Program

I'm on Ubuntu 20.04 with gcc 9.3.0 and ld 2.34. I have a simple hello-world program that does not use glibc or any other library; it only makes the write syscall. Despite this, the binary is roughly 8 KB. Why is it that large rather than, say, 1 KB?

C Program:

int
x64_syscall_write(int fd, char const *data, unsigned long int data_size)
{
  int result = 0;
  __asm__ __volatile__("syscall"
              : "=a" (result)
              : "a" (1), "D" (fd),
                "S" (data), "d" (data_size)
              : "r11", "rcx", "memory");
  return result;
}

__asm__(".global entry_point\n"
  "entry_point:\n"
  "xor rbp, rbp\n"
  "pop rdi\n"
  "mov rsi, rsp\n"
  "and rsp, 0xfffffffffffffff0\n"
  "call main\n"
  "mov rdi, rax\n"
  "mov rax, 60\n"
  "syscall\n"
  "ret");

int
main(int argc, char *argv[])
{
  x64_syscall_write(1, "hello\n", 6); 
  return 0;
}

Built with:

gcc -ffreestanding -static -nostdlib -no-pie -masm=intel \
-fno-unwind-tables -fno-asynchronous-unwind-tables \
-Wl,--gc-sections -fdata-sections -Os \
hello.c -c -o hello.o

# NOTE: I know more could be done here to shave 
# off a few more bytes, but I feel this is the bulk of it.

ld -e entry_point hello.o -o hello

hello.o is 1.7 KB; hello is 8.4 KB.

Upvotes: 0

Views: 671

Answers (1)

Employed Russian

Reputation: 213799

Look at the program headers of your binary:

readelf -Wl hello

Elf file type is EXEC (Executable file)
Entry point 0x40101c
There are 6 program headers, starting at offset 64

Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  LOAD           0x000000 0x0000000000400000 0x0000000000400000 0x0001b0 0x0001b0 R   0x1000
  LOAD           0x001000 0x0000000000401000 0x0000000000401000 0x000045 0x000045 R E 0x1000
  LOAD           0x002000 0x0000000000402000 0x0000000000402000 0x000007 0x000007 R   0x1000
  NOTE           0x000190 0x0000000000400190 0x0000000000400190 0x000020 0x000020 R   0x8
  GNU_PROPERTY   0x000190 0x0000000000400190 0x0000000000400190 0x000020 0x000020 R   0x8
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x10

 Section to Segment mapping:
  Segment Sections...
   00     .note.gnu.property
   01     .text
   02     .rodata
   03     .note.gnu.property
   04     .note.gnu.property
   05

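Here you can see that the linker created 3 LOAD segments: one for the ELF header and other metadata, one for .text and one for .rodata. Each segment is aligned to 0x1000 (4096) bytes in the file as well as in memory, so they start at file offsets 0x0, 0x1000 and 0x2000 even though together they hold only 0x1b0 + 0x45 + 0x7 bytes. The padding between them is what pushes the file past 8 KB.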
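The offsets in the readelf output can be reproduced with a little arithmetic. This is only a sketch of the rounding involved, not what ld actually runs; the names align_up and end_of_segments are made up for illustration, and it assumes every segment simply starts at the next multiple of the alignment.

```c
#include <stdint.h>

/* Round offset up to the next multiple of align.
   align must be a power of two (0x1000 in the readelf output). */
static uint64_t
align_up(uint64_t offset, uint64_t align)
{
  return (offset + align - 1) & ~(align - 1);
}

/* Hypothetical helper: file offset just past the last segment when
   every segment starts on a fresh boundary of `align` bytes. */
static uint64_t
end_of_segments(const uint64_t *file_sizes, int count, uint64_t align)
{
  uint64_t end = 0;
  for (int i = 0; i < count; i++) {
    uint64_t start = align_up(end, align); /* padding is inserted here */
    end = start + file_sizes[i];
  }
  return end;
}
```

Called with the FileSiz values {0x1b0, 0x45, 0x7} and an alignment of 0x1000, end_of_segments returns 0x2007 (8199), so the loaded segments alone already span a bit over 8 KB of the file, nearly all of it padding.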

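Linking with -z noseparate-code (ld -e entry_point hello.o -o hello -z noseparate-code) lets the linker pack everything into a single LOAD segment, which results in a much smaller binary (smaller than hello.o):

ls -l hello*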
-rwxr-xr-x 1 user user 1384 Apr 26 22:24 hello
-rw-r--r-- 1 user user  603 Apr 26 22:22 hello.c
-rw-r--r-- 1 user user 1680 Apr 26 22:22 hello.o

readelf -Wl hello

Elf file type is EXEC (Executable file)
Entry point 0x40015c
There are 4 program headers, starting at offset 64

Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  LOAD           0x000000 0x0000000000400000 0x0000000000400000 0x00018c 0x00018c R E 0x1000
  NOTE           0x000120 0x0000000000400120 0x0000000000400120 0x000020 0x000020 R   0x8
  GNU_PROPERTY   0x000120 0x0000000000400120 0x0000000000400120 0x000020 0x000020 R   0x8
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x10

 Section to Segment mapping:
  Segment Sections...
   00     .note.gnu.property .text .rodata
   01     .note.gnu.property
   02     .note.gnu.property
   03

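You can shrink this further by removing the .note.GNU-stack and .note.gnu.property sections from the object file before linking (the pattern is quoted so the shell does not try to expand it):

objcopy -R '.note.*' hello.o hello1.o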
ld -e entry_point hello1.o -o hello1 -z noseparate-code

ls -l hello1*
-rwxr-xr-x 1 user user 1072 Apr 26 22:38 hello1
-rw-r--r-- 1 user user 1440 Apr 26 22:37 hello1.o

readelf -Wl hello1

Elf file type is EXEC (Executable file)
Entry point 0x400094
There is 1 program header, starting at offset 64

Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  LOAD           0x000000 0x0000000000400000 0x0000000000400000 0x0000c4 0x0000c4 R E 0x1000

 Section to Segment mapping:
  Segment Sections...
   00     .text .rodata
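
If you would rather check the segment count from code than read it off the readelf output, something along these lines should work. It is a rough sketch, not a replacement for readelf: count_load_segments is a name made up here, it only handles 64-bit ELF files, and it assumes the file has the same byte order as the host (true for x86-64 binaries inspected on x86-64).

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Count PT_LOAD program headers in a 64-bit ELF image held in memory.
   Returns -1 if the buffer does not look like a usable ELF64 image. */
static int
count_load_segments(const unsigned char *image, size_t size)
{
  Elf64_Ehdr ehdr;
  if (size < sizeof ehdr)
    return -1;
  memcpy(&ehdr, image, sizeof ehdr);

  if (memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0 ||
      ehdr.e_ident[EI_CLASS] != ELFCLASS64 ||
      ehdr.e_phentsize != sizeof(Elf64_Phdr))
    return -1;

  /* The whole program header table must fit inside the buffer. */
  if (ehdr.e_phoff > size ||
      (size - ehdr.e_phoff) / sizeof(Elf64_Phdr) < ehdr.e_phnum)
    return -1;

  int loads = 0;
  for (unsigned i = 0; i < ehdr.e_phnum; i++) {
    Elf64_Phdr phdr;
    memcpy(&phdr, image + ehdr.e_phoff + i * sizeof phdr, sizeof phdr);
    if (phdr.p_type == PT_LOAD)
      loads++;
  }
  return loads;
}
```

Going by the readelf listings above, reading each file into a buffer and passing it to this function should give 3 for the original hello and 1 for both binaries linked with -z noseparate-code.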

Upvotes: 4
