Kevin Catino

Reputation: 41

Why does initializing an array with a one and zeros make the executable file so big?

If I compile the program int array[5000]={0}; int main(){}, the output file is much smaller than when I compile int array[5000]={1}; int main(){}, which initializes the first element to one and the rest to zeros. Why is there such a big difference in file size?

Upvotes: 4

Views: 386

Answers (4)

H.S.

Reputation: 12679

From ".bss", section "BSS in C":

An implementation may also assign statically-allocated variables and constants initialized with a value consisting solely of zero-valued bits to the BSS section.

The size that BSS will require at runtime is recorded in the object file, but BSS (unlike the data segment) doesn't take up any actual space in the object file.

For the program int array[5000]={0}; int main(){}

data and bss size:

# size a.out 
   text    data     bss     dec     hex filename
   1040     484   20032   21556    5434 a.out

executable size:

# ls -l a.out
-rwxr-xr-x. 1 root root 6338 Sep  7 17:05 a.out

For the program int array[5000]={1}; int main(){}

data and bss size:

# size a.out
   text    data     bss     dec     hex filename
   1040   20512      16   21568    5440 a.out

executable size:

# ls -l a.out
-rwxr-xr-x. 1 root root 26362 Sep  7 17:24 a.out

The output shown above is from a Linux platform.

Upvotes: 0

Brendan

Reputation: 37222

so why is there such a big difference on the file size?

Essentially, it's because the compiler/linker/executable loader aren't good at optimizing.

If a statically allocated array is full of zeros (or uninitialized), the compiler puts it in a special section (".bss") with everything else that's zeros (or uninitialized). Because the program loader knows the entire section is full of zeros, none of the data is stored in the file itself.

If a statically allocated array isn't full of zeros, the compiler puts it in a different section (".data") and all of the data gets included in the file (even when it's "almost but not quite full of zeros").

Ideally, the compiler/tools would be able to detect simple cases (e.g. an array that is almost but not quite full of zeros because it is initialized with one non-zero value) and put the array in ".bss" so it costs nothing in the file, then generate a small amount of start-up code to correct it (e.g. set the first element of the array) before any of your code executes.

As a workaround (if the array isn't read-only), you could do the same optimization yourself: leave the array full of zeros, and put an array[0] = 1; at the start of your main().

Upvotes: 0

Luis Colorado

Reputation: 12698

When you don't initialize a global (or static) variable, it gets allocated in an output segment called .bss, which is all zeros, so its contents don't need to be written to the output file. If even a single bit differs from zero, the variable has to go into the initialized data segment (.data), which is written to the output file because its contents must be spelled out. Even if you explicitly initialize the variable to zeros, the compiler recognizes that this matches the initial state of an uninitialized variable and stores the array in the .bss segment too, avoiding the growth of the final file.

For the .data segment, all of its contents are saved in the executable file, while for the .bss segment only its size is stored, as the kernel can allocate a zero-filled segment for it when it is loaded into memory.

In Unix systems, the data segment is set up using the full size of the data segments (.data plus .bss), but only the .data contents are copied from the file at load time. The rest is always filled with zeros by the kernel, by default. This speeds up loading the program into memory and makes the executable smaller.

Upvotes: 0

CiaPan

Reputation: 9570

Your array is a global variable with static storage duration.

If it is declared as initialized with zeros only, it can be allocated in a special segment of memory, which is created during the process startup and initialized with zeros.

OTOH, if it is declared as containing anything non-zero, its initial value must be stored inside the program's file, so that when the operating system prepares the program in memory to be run, it can allocate an appropriate data segment and fill it with the defined initial values.

See https://en.wikipedia.org/wiki/Data_segment for DATA and BSS segments.

Upvotes: 3
