Konrad

Reputation: 41027

Initialising arrays in C++

Everywhere I look there are people who argue vociferously that uninitialised variables are bad, and I certainly agree and understand why. My question is: are there occasions when you would not want to initialise a variable?

For example, take the code:

char arrBuffer[1024] = { '\0' };

Does zeroing the entire array have a performance impact compared with using the array without initialising it?

Upvotes: 5

Views: 3947

Answers (10)

Stephen Nutt

Reputation: 3306

Personally I'm against initializing an array when it is created. Consider the following two pieces of code.

char buffer[1024] = {0};
for (int i = 0; i < 1000000; ++i)
{
  // Use buffer
}

vs.

for (int i = 0; i < 1000000; ++i)
{
  char buffer[1024] = {0};
  // Use buffer
}

In the first example, why bother initializing buffer, since the second time around the loop buffer is no longer zero-initialized? My use of buffer must work without it being initialized for all but the first iteration. All the initialization does is consume time, bloat the code and, if I typically only go through the loop once, obscure bugs.

While I could certainly refactor the code into the second example, do I really want to zero-initialize a buffer inside a loop if I could rewrite my code so that it was not necessary?

I suspect most compilers these days have options to fill uninitialized variables with non-zero values. We run all our debug builds this way to help detect use of uninitialized variables, and in release mode we turn the option off so that the variables are truly uninitialized. As Sherwood Hu said, some compilers can inject code to help detect use of uninitialised variables.
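Where a compiler option for this is not available, the same idea can be approximated by hand. The following is only a sketch: the names poison, kPoisonByte and POISON_IN_DEBUG are hypothetical, the fill value 0xCD is an arbitrary choice, and it assumes the usual convention that release builds define NDEBUG.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical fill byte; any value unlikely to look like valid data works.
constexpr unsigned char kPoisonByte = 0xCD;

// Overwrites a buffer with the poison byte so that reads of "uninitialised"
// data stand out in a debugger instead of looking like plausible zeros.
inline void poison(void* buffer, std::size_t size) {
    std::memset(buffer, kPoisonByte, size);
}

// Hypothetical macro: poisons the array in debug builds and compiles to
// nothing when NDEBUG is defined (assumed here to mean a release build).
#ifndef NDEBUG
#define POISON_IN_DEBUG(array) poison((array), sizeof(array))
#else
#define POISON_IN_DEBUG(array) ((void)0)
#endif
```

A buffer declared as char buffer[1024]; followed by POISON_IN_DEBUG(buffer); then costs nothing in release builds, while a debug build that reads it too early sees an obviously wrong pattern rather than zeros.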

Edit: In the code above I'm initializing buffer to the value 0, (not the character '0'), which is equivalent to initializing it with '\0'.

To further clarify my first code snippet, imagine the following contrived example.

char buffer[1024] = {0};
for (int i = 0; i < 1000000; ++i)
{
  // Buffer is 0 initialized, so it is fine to call strlen
  int len = strlen (buffer);
  memset (buffer, 'a', 1024);
}

The first time through the loop the buffer is initialized to 0, so strlen will return 0. The second time through the loop the buffer is no longer initialized to 0, and in fact does not contain a single 0 character, so the behaviour of strlen is undefined.

Since you have agreed with me that, if buffer is initialized, moving it inside the loop is not advisable, and I've shown that initializing it outside the loop offers no protection, why initialize it at all?

Upvotes: 0

Stack Overflow is garbage

Reputation: 248279

Your variables should be initialized to a meaningful value. Blindly and naively setting everything to zero isn't much better than leaving it uninitialized. It might make invalid code crash, instead of behaving unpredictably, but it won't make the code correct.

If you just naively zero out the array when creating it just to avoid uninitialized variables, it is still logically uninitialized. It doesn't yet have a value that is meaningful in your application.

If you're going to initialize variables (and you should), give them values that make sense in your application. Does the rest of your code expect the array to be zero initially? If so, set it to zero. Otherwise set it to some other meaningful value.

Or if the rest of your code expects to write to the array without first reading from it, then by all means leave it uninitialized.

Upvotes: 0

cuteCAT

Reputation: 2311

I consider it bad advice to require all variables to be default-initialized at the time of declaration. In most cases it is unnecessary and carries a performance penalty.

For example, I often use the code below to convert a number to a string:

char s[24];
sprintf(s, "%d", int_val);

I won't write:

char s[24] = "\0";
sprintf(s, "%d", int_val);

Modern compilers can often warn when a variable is used without being initialized.
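As a minimal sketch of the write-before-read pattern above, the conversion can be wrapped in a function. The name format_int is hypothetical, and snprintf is swapped in for sprintf so that the buffer size is checked. GCC and Clang generally report reads of uninitialised locals through -Wuninitialized (part of -Wall), though such warnings are not guaranteed to catch every case.

```cpp
#include <cstdio>
#include <string>

// Formats an int into a deliberately uninitialised stack buffer.
// format_int is a hypothetical helper name used for illustration.
std::string format_int(int value) {
    char s[24];  // not initialised: snprintf writes the digits and the '\0'
    std::snprintf(s, sizeof s, "%d", value);
    return std::string(s);  // s is only read after it has been written
}
```

Zeroing s first would add work without changing the result, because every byte that is later read has already been written by snprintf.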

Upvotes: 0

Kaz Dragon

Reputation: 6809

To answer your question: it might have a performance impact. It's possible that a compiler could detect that the initial values of the array are never used and simply skip the initialisation. It's possible.

I personally think this is a matter of personal style. I'm tempted to say: leave it uninitialised, and use a Lint-like tool to tell you if you're using it uninitialised, which is surely a bug (as opposed to using the default value and not being told, which is also a bug, but a silent one).

Upvotes: 0

paxdiablo

Reputation: 882716

The rule is that variables should be set before they're used.

You do not have to explicitly initialize them on creation if you know you will be setting them elsewhere before use.

For example, the following code is perfectly okay:

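#include <stddef.h>

int main (void) {
    int a[1000];                      // deliberately left uninitialized
    // ...
    // Every element is written here before anything reads it.
    for (size_t i = 0; i < sizeof(a)/sizeof(*a); i++)
        a[i] = (int)i;
    // ...
    // Now use a[whatever] here.
    // ...
    return 0;
}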

In that case, it's wasteful to initialize the array at the point of its creation.

As to whether there's a performance penalty, it depends partially on where your variable is defined and partially on the execution environment.

The C standard guarantees that variables defined with static storage duration (either at file level or as statics in a function) are first initialized to zero (arithmetic types to 0 and pointers to null, which on most platforms is a bit pattern of all zeros), then set to their respective initialized values.

It does not mandate how that second step is done. A typical way is to just have the compiler itself create the initialized variable and place it in the executable so that it's initialized by virtue of the fact that the executable is loaded. This will have no performance impact (for initialization, obviously it will have some impact for program load).

Of course, an implementation may wish to save space in the executable and initialize those variables with code (before main is called). This will have a performance impact but it's likely to be minuscule.

As to those variables with automatic storage duration (local variables and such), they're never implicitly initialized, so explicitly initializing them does carry a performance cost. By "never implicitly initialized", I mean the code segment:

void x(void) {
    int x[1000];
    ...
}

will result in x[] having indeterminate values. But since:

void x(void) {
    int x[1000] = {0};
}

may simply result in a 1000-integer memcpy-type operation (more likely memset for that case), this is likely to be fast as well. You just need to keep in mind that the initialization will happen every time that function is called.
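The difference between the two storage durations can be sketched as follows. The function names automatic_counter and static_counter are hypothetical and exist only for this illustration.

```cpp
// Automatic array: the {0} initialiser runs again on every call,
// so whatever the previous call stored is gone.
int automatic_counter() {
    int x[4] = {0};
    x[0] += 1;
    return x[0];
}

// Static array: zero-initialised once before first use, with no
// initialiser needed; values persist between calls.
int static_counter() {
    static int x[4];
    x[0] += 1;
    return x[0];
}
```

Calling automatic_counter repeatedly always returns 1, because the array is re-zeroed each time, while static_counter returns 1, 2, 3 and so on. The per-call zeroing is the cost being discussed.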

Upvotes: 4

Arkaitz Jimenez

Reputation: 23198

I assume you mean an array on the stack, because static arrays are zero-initialized automatically.
G++ output:

   char whatever[2567] = {'\0'};
   8048530:       8d 95 f5 f5 ff ff       lea    -0xa0b(%ebp),%edx
   8048536:       b8 07 0a 00 00          mov    $0xa07,%eax
   804853b:       89 44 24 08             mov    %eax,0x8(%esp)
   804853f:       c7 44 24 04 00 00 00    movl   $0x0,0x4(%esp)
   8048546:       00 
   8048547:       89 14 24                mov    %edx,(%esp)
   804854a:       e8 b9 fe ff ff          call   8048408 <memset@plt>

So, you initialize with {'\0'} and a call to memset is done, so yes, you have a performance hit.
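In source terms, the disassembly suggests that the initialiser behaves like a declaration followed by an explicit memset. A small sketch that checks the two forms produce identical contents (initializer_matches_memset and kSize are hypothetical names; the size simply mirrors the listing above):

```cpp
#include <cstddef>
#include <cstring>

constexpr std::size_t kSize = 2567;  // same size as the array in the listing

// Returns true when an array created with = {'\0'} holds the same bytes
// as an uninitialised array that is then cleared with memset.
bool initializer_matches_memset() {
    char with_initializer[kSize] = {'\0'};  // remaining elements are zeroed too
    char with_memset[kSize];
    std::memset(with_memset, 0, sizeof with_memset);
    return std::memcmp(with_initializer, with_memset, kSize) == 0;
}
```

Whether a compiler emits a memset call, inline stores or nothing at all depends on the compiler, the optimisation level and how the array is used afterwards.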

Upvotes: 10

spoulson

Reputation: 21601

If the variable is a global or static, then its initial data is typically stored in the compiled executable. So, a char arrBuffer[1024] with a non-zero initializer will usually increase executable size by about 1024 bytes, whereas zero-initialized data is commonly placed in a section (such as .bss) that records only its size. Either way, when the program starts, none of your code has to run to initialize the variables.

On the other hand, variables on the stack, such as non-static local function variables, are not stored in the executable the same way. Instead, on function entry the space is allocated on the stack and a memset or memcpy places the data into the variable, thereby impacting performance.

Upvotes: 7

pmg

Reputation: 108986

Measure!

#include <stdio.h>
#include <time.h>

int main(void) {
  clock_t t0;
  int k;

  t0 = clock();
  for (k=0; k<1000000; k++) {
    int a[1000];
    a[420] = 420;
  }
  printf("Without init: %f secs\n", (double)(clock() - t0) / CLOCKS_PER_SEC);

  t0 = clock();
  for (k=0; k<1000000; k++) {
    int a[1000] = {0};
    a[420] = 420;
  }
  printf("   With init: %f secs\n", (double)(clock() - t0) / CLOCKS_PER_SEC);

  return 0;
}

$ gcc measure.c
$ ./a.out
Without init: 0.000000 secs
   With init: 0.280000 secs
$ gcc -O2 measure.c
$ ./a.out
Without init: 0.000000 secs
   With init: 0.000000 secs
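The zero timings at -O2 most likely mean the optimiser removed work whose result is never observed, so both loops collapse. A hedged variant that tries to keep the array alive passes it through a volatile function pointer, which the compiler cannot see through. All names here (g_sink, read_one, g_consume, seconds_without_init, seconds_with_init) are hypothetical, and the numbers will still vary by compiler and machine.

```cpp
#include <chrono>

volatile int g_sink = 0;  // observable side effect so the reads are kept

// Reads one element; reached only through the opaque pointer below.
void read_one(const int* a) { g_sink = a[420]; }

// Volatile function pointer: the call target must be loaded at run time,
// so the compiler has to assume the callee may read the whole array.
void (*volatile g_consume)(const int*) = read_one;

// Times a loop that declares the array without initialising it.
double seconds_without_init(int iterations) {
    const auto start = std::chrono::steady_clock::now();
    for (int k = 0; k < iterations; ++k) {
        int a[1000];   // no initialisation
        a[420] = k;    // the only element that is ever read
        g_consume(a);
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

// Times the same loop with the array zero-initialised on each iteration.
double seconds_with_init(int iterations) {
    const auto start = std::chrono::steady_clock::now();
    for (int k = 0; k < iterations; ++k) {
        int a[1000] = {0};  // zeroed on every iteration
        a[420] = k;
        g_consume(a);
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Printing seconds_without_init(1000000) and seconds_with_init(1000000) at several optimisation levels, and checking the generated assembly, gives a fairer picture than timing loops the compiler is free to delete.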

Upvotes: 2

user212328

Reputation: 620

For a large array the performance impact may be significant. Initializing all variables by default doesn't actually offer many benefits. It's not a solution for bad code; moreover, it might hide real issues that the compiler could otherwise catch. You need to keep track of the state of every variable over its whole lifespan to make your code reliable anyway.

Upvotes: 0

Priyank Bolia

Reputation: 14449

And why do you care about the performance benefit? How much performance do you gain by not initializing it, and is that worth more than the debugging time you save by not chasing garbage pointers?

Upvotes: -2
