user17066118

Linking with gcc increases file size to 16 KB

I'm on Linux right now. I'm compiling a super simple C program:

#include <stdio.h>
int main()
{
    printf("Hello, world!\n");
    return 0;
}

and compiling with gcc main.c -o main

After running ll to get file sizes, this is what it returns:

-rwxr-xr-x 1 xylight xylight  16K Nov  5 11:30 main
-rw-r--r-- 1 xylight xylight   68 Nov  5 11:23 main.c
-rw-r--r-- 1 xylight xylight 1.5K Nov  5 11:30 main.o

After linking main.o, the file size becomes 16KB! How can I make this smaller? Any linker options?

I'm not sure if this is a duplicate, I couldn't find anything on here. Let me know if it's a duplicate.

After running readelf -h main it says that the ELF type is this: DYN (Shared object file)

Anyone know what I could do to make this smaller?

Upvotes: 0

Views: 540

Answers (1)

yugr

Reputation: 21999

I wasn't able to find a good explanation of this on SO so let me post one here.

First of all, by default the executable includes a static symbol table (.symtab), which is used for debugging and is not loaded at runtime. We can get rid of it with strip main, which saves about 2K (down to 15K on my Ubuntu 20).

Now, we can take a closer look at the overheads by running readelf -SW main:

  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .interp           PROGBITS        0000000000000318 000318 00001c 00   A  0   0  1
  [ 2] .note.gnu.property NOTE            0000000000000338 000338 000020 00   A  0   0  8
  [ 3] .note.gnu.build-id NOTE            0000000000000358 000358 000024 00   A  0   0  4
  [ 4] .note.ABI-tag     NOTE            000000000000037c 00037c 000020 00   A  0   0  4
  [ 5] .gnu.hash         GNU_HASH        00000000000003a0 0003a0 000024 00   A  6   0  8
  [ 6] .dynsym           DYNSYM          00000000000003c8 0003c8 0000a8 18   A  7   1  8
  [ 7] .dynstr           STRTAB          0000000000000470 000470 000082 00   A  0   0  1
  [ 8] .gnu.version      VERSYM          00000000000004f2 0004f2 00000e 02   A  6   0  2
  [ 9] .gnu.version_r    VERNEED         0000000000000500 000500 000020 00   A  7   1  8
  [10] .rela.dyn         RELA            0000000000000520 000520 0000c0 18   A  6   0  8
  [11] .rela.plt         RELA            00000000000005e0 0005e0 000018 18  AI  6  24  8
  [12] .init             PROGBITS        0000000000001000 001000 00001b 00  AX  0   0  4
  [13] .plt              PROGBITS        0000000000001020 001020 000020 10  AX  0   0 16
  [14] .plt.got          PROGBITS        0000000000001040 001040 000010 10  AX  0   0 16
  [15] .plt.sec          PROGBITS        0000000000001050 001050 000010 10  AX  0   0 16
  [16] .text             PROGBITS        0000000000001060 001060 000185 00  AX  0   0 16
  [17] .fini             PROGBITS        00000000000011e8 0011e8 00000d 00  AX  0   0  4
  [18] .rodata           PROGBITS        0000000000002000 002000 000012 00   A  0   0  4
  [19] .eh_frame_hdr     PROGBITS        0000000000002014 002014 000044 00   A  0   0  4
  [20] .eh_frame         PROGBITS        0000000000002058 002058 000108 00   A  0   0  8
  [21] .init_array       INIT_ARRAY      0000000000003db8 002db8 000008 08  WA  0   0  8
  [22] .fini_array       FINI_ARRAY      0000000000003dc0 002dc0 000008 08  WA  0   0  8
  [23] .dynamic          DYNAMIC         0000000000003dc8 002dc8 0001f0 10  WA  7   0  8
  [24] .got              PROGBITS        0000000000003fb8 002fb8 000048 08  WA  0   0  8
  [25] .data             PROGBITS        0000000000004000 003000 000010 00  WA  0   0  8
  [26] .bss              NOBITS          0000000000004010 003010 000008 00  WA  0   0  1
  [27] .comment          PROGBITS        0000000000000000 003010 00002a 01  MS  0   0  1
  [28] .shstrtab         STRTAB          0000000000000000 00303a 00010a 00      0   0  1

As you can see, the first 1.5K (up to, but not including, .init) holds the ELF header and bookkeeping data for loading shared libraries (.gnu.hash, .dynsym, etc.).

This is followed by about 500 bytes of code (.init, .plt, etc., up to, but not including, .rodata). Note that the code starts at a 4K page boundary, so 2.5K is wasted on padding in front of it.

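Then another 3.5K is wasted to align the data that follows the code at the next 4K page boundary. That data totals about 1K (.rodata, unwinding tables, etc.). There is also an interesting waste of 3K between .eh_frame and .init_array, which comes from the alignment between read-only and writable data: .init_array sits at file offset 0x2db8 and address 0x3db8, i.e. at the same offset within a page in both the file and memory (see this question for more details).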

So to summarize, only a small fraction of the ELF file (1.5+0.5+1=3K, i.e. 20%) is really used; the rest is padding that keeps addresses properly aligned when the file is mmapped into memory. Address alignment is needed so that the loader can assign different permissions to different memory pages (e.g. code pages cannot be written but can be executed, while for writable data pages the permissions are reversed).

Upvotes: 1
