Guillaume D

Reputation: 2326

Could variable zero-initialization reduce performance?

I am correcting static analysis (MISRA-C-2012) violations, one rule of which (rule 9.3) states that variables shall be initialized before use.

For instance:

#include <stdio.h>

void bar_read(int * array)
{
    printf("array[1]: %d\n", array[1]);
}

void bar_write(int * array)
{
    array[1]=1;    
}

int main(void)
{
    #define FOO_SIZE 12
 #ifdef MISRA_VIOLATION_DISABLED
    int foo[FOO_SIZE]  = {0}; //ok
 #else
    int foo[FOO_SIZE]; //violation
 #endif
    bar_read(foo);
    bar_write(foo);
    bar_read(foo); 

    return 0;
}

Some colleagues of mine declared that they were removing variable initialization (for big arrays), i.e. foo[FOO_SIZE] = {0};, because it was reducing performance, which puzzled me.

In my understanding, zero-initialized variables are placed in the .bss section at compile time, so there is no runtime performance impact.

Could I be wrong? Might it depend on the compiler? Is there some optimization that makes this true?

Upvotes: 3

Views: 550

Answers (3)

John Bode

Reputation: 123448

auto variables are instantiated at runtime, so any initialization also has to occur at runtime, which will incur some performance penalty - exactly how much depends on the compiler and level of optimization.

Having said that, your colleagues should not remove the initialization without doing one of two things:

  • proving that there's no code that will try to read any array element before it is assigned;

  • quantifying the performance loss and showing that it falls outside of some requirement or specification - e.g. "Requirement X says that this operation must complete in 100 ms or less, but with the initialization it's taking 120 ms" or something like that.

EDIT

For example, I changed the code to define the initializer as part of the build command, then I did some crude instrumentation with the clock() library function:

#include <stdio.h>
#include <time.h>

void bar_read( int *array )
{
  printf( "array[1]: %d\n", array[1] );
}

void bar_write( int *array )
{
  array[1] = 1;
}

int main( void )
{
  clock_t start = clock();
#ifndef FOO_SIZE
#define FOO_SIZE 2000
#endif

#ifndef INIT 
#define INIT
#endif

  int foo[FOO_SIZE] INIT ; // will expand to nothing or ={0} depending on build command
  bar_read( foo );
  bar_write( foo );
  bar_read( foo );

  clock_t end = clock();
  printf( "operation took %lu clocks (%f seconds)\n", (unsigned long)(end-start), (double)(end-start)/CLOCKS_PER_SEC );
  return (int)(end-start);
}

So I can build with and without initialization and see if there's a difference in how long a run takes:

$ gcc -o init -std=c11 -pedantic -Wall -Werror -DFOO_SIZE=2000 -DINIT="" init.c
$ ./init
array[1]: -1898976766
array[1]: 1
operation took 39 clocks (0.000039 seconds)

$ gcc -o init -std=c11 -pedantic -Wall -Werror -DFOO_SIZE=2000 -DINIT="={0}" init.c
$ ./init
array[1]: 0
array[1]: 1
operation took 53 clocks (0.000053 seconds)

I have main return the number of clocks taken up by the main part of the program. I then wrote a shell script to build the code with and without the array initializer, run each version a hundred times (bigger sample than we need, but it doesn't take that much time to run) and take the average of those runs (integer average, but good enough for illustration):

#!/bin/bash

INIT_PARAMS=( '""' '"={0}"' )
let runs=100

for INIT in "${INIT_PARAMS[@]}"
do
  cmd="gcc -o init -std=c11 -pedantic -Wall -Werror -DFOO_SIZE=2000 -DINIT=${INIT} init.c"
  echo $cmd
  eval $cmd
  let x=0
  for i in `seq 1 1 $runs`
  do
    ./init >/dev/null # suppress output from init itself
    let x=$x+$?
  done
  echo "Average clocks per run for INIT=${INIT} is $((x / runs))"
done

And the output I get is:

$ . init_test.sh 
gcc -o init -std=c11 -pedantic -Wall -Werror -DFOO_SIZE=2000 -DINIT="" init.c
Average clocks per run for INIT="" is 24
gcc -o init -std=c11 -pedantic -Wall -Werror -DFOO_SIZE=2000 -DINIT="={0}" init.c
Average clocks per run for INIT="={0}" is 33

So there is a definite penalty for initializing a 2000-element array of int as part of its declaration, and on average it's 9 clocks (0.000009 seconds), or a 37% increase, without any optimization. Upping the optimization level would reduce that cost (probably), but not eliminate it completely.

Upvotes: 2

dbush

Reputation: 223699

Variables defined inside of a function without the static keyword have automatic storage duration. These variables are typically created on the stack when they come into scope.

This means that if such variables are initialized then there is a cost at runtime to initialize them.

Only variables with static storage duration, i.e. variables declared at file scope or with the static keyword, are typically defined in either .data if explicitly initialized or .bss if not.

When compiling this code under gcc 4.8.5 with -O0, defining MISRA_VIOLATION_DISABLED resulted in the following additional code:

subq    $48, %rsp             # reserve 48 bytes (12 ints) on the stack for foo
leaq    -48(%rbp), %rsi       # address of foo
movl    $0, %eax              # rax = 0, the value stosq will store
movl    $6, %edx              # 6 quadwords = 48 bytes
movq    %rsi, %rdi            # rdi = destination for rep stosq
movq    %rdx, %rcx            # rcx = repeat count
rep stosq                     # store rax to [rdi], rcx times: zero-fill foo

Upvotes: 4

Eric Postpischil

Reputation: 222342

An array defined with int foo[FOO_SIZE] (no static or extern) inside a function has automatic storage duration, meaning it is “created” (memory is reserved for it) each time execution reaches the block it is in and is “destroyed” (memory is released) when execution of that block ends. Because functions can be called recursively, memory for automatic objects cannot feasibly be reserved in the .bss section. The stack is generally used for them.

Further, even if they were in the .bss section, their lifetimes in the C model still begin and end each time the block they are in begins and ends. So, if they are initialized, they have to be initialized each time a new lifetime begins. Storing them in the .bss section would not save anything in this regard.

Further, if the .bss section is zero-initialized, that is not free. Whenever the operating system provides memory to back a zero-initialized section, it must clear that memory.

Upvotes: 6
