Reputation: 4687
In C (or maybe C++), what's the difference between
char myarr[16]={0x00};
and
char myarr[16]; memset(myarr, '\0', sizeof(myarr));
?
Edit: I ask because in VC++ 2005 the result is the same.
Edit 2: and what about
char myarr[16]={0x00,};
?
Upvotes: 18
Views: 10191
Reputation: 71516
Defining initial values in the variable declaration happens at a different place than using memset.
In the former case, the zeros are recorded in some form in the binary as zero-initialized memory (or as non-zero data, depending on what you initialize to), and you hope that the loader honors that; this has ABSOLUTELY nothing to do with C language standards. The latter, using memset, depends on the C library, which you would also hope works. I have more faith in the library.
I do a lot of embedded code where you learn to avoid the bad habit of initializing variables as part of the variable declaration and instead do it within the code.
For standard operating systems (Linux, Windows, etc.), initializing during variable declaration is fine. You will get an imperceptible performance increase, but if you are running an operating system, you are on a platform that is fast enough not to see that difference.
Depending on the binary type the former case of the init during declaration can make the binary larger. This is extremely easy to test for. Compile your binary as above, then change the array size from [16] to [16000] then compile again. Then compile without the = {0x00} and compare the three binary sizes.
For most systems that most programmers will ever see, there is no functional difference. I recommend the memset as a habit. Despite what the standards say, many if not most C compilers (most of which most programmers will never see in their careers) won't like that init, because the number of initializers doesn't match the size of the array. Most compilers do not conform to the standards even if they claim to. Instead, develop good habits that avoid shortcuts, or pretty much anything that should work for standard X but differs from the prior standard M. (Avoid any gee-whiz compiler or standards-based tricks.)
Upvotes: 1
Reputation: 25581
Given the hard-to-dispute fact that = { 0 } is infinitely more readable than memset(..., ..., ... sizeof ...), the following would discourage explicitly using memset:
In Visual Studio 2005, compiling for Windows Mobile, full optimized release build:
; DWORD a[10] = { 0 };
mov r3, #0
mov r2, #0x24
mov r1, #0
add r0, sp, #4
str r3, [sp]
bl memset
add r4, sp, #0
mov r5, #0xA
; DWORD b[10];
; memset(b, 0, sizeof(b));
mov r2, #0x28
mov r1, #0
add r0, sp, #0x28
bl memset
add r4, sp, #0x28
mov r5, #0xA
Pretty much the same.
Upvotes: 3
Reputation: 31708
Practically they're the same. The first form is guaranteed to init the whole type to 0x00 (even padding space between structure elements for example), and this is defined since C90. Unfortunately gcc gives a warning for the first form with the -Wmissing-field-initializers option. More details here:
http://www.pixelbeat.org/programming/gcc/auto_init.html
Upvotes: 0
Reputation: 506857
The important difference is that the first default-initializes the array in an element-specific manner: pointers will receive a null pointer value, which doesn't need to be 0x00 (as in all-bits-zero), and booleans will be false. If the element type is a class type that's not a so-called POD (plain old data type), then you can only do the first one, because the second one only works for the simplest cases (where you don't have virtual functions, user-defined constructors and so on). In contrast, the second way, using memset, sets all elements of the array to all-bits-zero. That is not always what you want. If your array has pointers, for example, they won't necessarily be set to null pointers.
The first will default-initialize the elements of the array, except for the first one, which is set to 0 explicitly. If the array is local and on the stack (that is, not static), the compiler often does a memset internally to clear the array out. If the array is non-local or static, the first version can be considerably more efficient. The compiler can put the initializers into the generated assembler code at compile time, so that no runtime code is required at all. Alternatively, the array can be laid out in a section that is automatically zeroed out in a fast manner (i.e. page-wise) when the program starts (this also works for pointers, if they have an all-bits-zero representation).
The second does a memset explicitly over the whole array. Optimizing compilers will usually replace a memset for smaller regions with inline machine code that just loops using labels and branches.
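To make the pointer point concrete, here is a minimal sketch in C. The `struct entry` type and the function names are made up for illustration and are not from the original post. The initializer form gives every member its type's zero value, so the first check should hold on any conforming implementation. The memset form only writes all-bits-zero, so the second check is merely expected to hold on mainstream platforms where a null pointer and 0.0 happen to be represented as all-bits-zero.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical element type with a pointer and a floating-point member. */
struct entry {
    const char *name;
    double      weight;
    int         count;
};

/* Returns 1 if every member of every element compares equal to zero/null. */
int all_zero_values(const struct entry *e, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (e[i].name != NULL || e[i].weight != 0.0 || e[i].count != 0)
            return 0;
    }
    return 1;
}

/* Form 1: initializer list. Members get null / 0.0 / 0 by value. */
int check_initializer(void)
{
    struct entry table[4] = { 0 };
    return all_zero_values(table, 4);
}

/* Form 2: memset. Every byte becomes 0; what that means for a pointer
 * or a double depends on the platform's representation. */
int check_memset(void)
{
    struct entry table[4];
    memset(table, 0, sizeof table);
    return all_zero_values(table, 4);
}
```

On an ordinary desktop compiler both functions will most likely return 1, which is why the difference is easy to overlook.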
Here is assembler-code generated for the first case. My gcc stuff isn't much optimized, so we got a real call to memset (16 bytes at the stack-top are always allocated, even if we got no locals. $n is a register number):
void f(void) {
int a[16] = { 42 };
}
sub $29, $29, 88 ; create stack-frame, 88 bytes
stw $31, $29, 84 ; save return address
add $4, $29, 16 ; 1st argument is destination, the array.
add $5, $0, 0 ; 2nd argument is value to fill
add $6, $0, 64 ; 3rd argument is size to fill: 4byte * 16
jal memset ; call memset
add $2, $0, 42 ; set first element, a[0], to 42
stw $2, $29, 16 ;
ldw $31, $29, 84 ; restore return address
add $29, $29, 88 ; destroy stack-frame
jr $31 ; return to caller
The gory details from the C++ Standard. The first case above will default-initialize remaining elements.
8.5:
To zero-initialize storage for an object of type T means:
- if T is a scalar type, the storage is set to the value of 0 (zero) converted to T;
- if T is a non-union class type, the storage for each nonstatic data member and each base-class subobject is zero-initialized;
- if T is a union type, the storage for its first data member is zero-initialized;
- if T is an array type, the storage for each element is zero-initialized;
- if T is a reference type, no initialization is performed.
To default-initialize an object of type T means:
- if T is a non-POD class type, the default constructor for T is called
- if T is an array type, each element is default-initialized;
- otherwise, the storage for the object is zero-initialized.
8.5.1:
If there are fewer initializers in the list than there are members in the aggregate, then each member not explicitly initialized shall be default-initialized (8.5).
Upvotes: 20
Reputation: 169543
ISO/IEC 9899:TC3 6.7.8, paragraph 21:
If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.
Arrays with static storage duration are initialized to 0, so the C99 spec guarantees the not explicitly initialized array elements to be set to 0 as well.
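As a quick sanity check of that guarantee, here is a small sketch in C. The function names are invented for this example. The first function compares the three spellings from the question byte by byte; the second uses a non-zero first initializer so the implicit zeroing of the remaining elements is visible.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the three spellings from the question leave identical bytes. */
int three_forms_agree(void)
{
    char a[16] = { 0x00 };
    char b[16] = { 0x00, };          /* a trailing comma is allowed */
    char c[16];

    memset(c, '\0', sizeof c);
    return memcmp(a, b, sizeof a) == 0 && memcmp(a, c, sizeof a) == 0;
}

/* Counts the non-zero elements of an array initialized with { 42 }. */
size_t nonzero_after_partial_init(void)
{
    int v[16] = { 42 };              /* v[0] is 42, v[1..15] are implicitly 0 */
    size_t n = 0;

    for (size_t i = 0; i < sizeof v / sizeof v[0]; i++) {
        if (v[i] != 0)
            n++;
    }
    return n;
}
```

`three_forms_agree()` should return 1, and `nonzero_after_partial_init()` should return 1 because only `v[0]` is non-zero.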
In my first edit to this post, I spouted some nonsense about using compound literals to assign to an array after initialization. That does not work. If you really want to use compound literals to set an array's values, you have to do something like this:
#define count(ARRAY) (sizeof(ARRAY)/sizeof(*ARRAY))
int foo[16];
memcpy(foo, ((int [count(foo)]){ 1, 2, 3 }), sizeof(foo));
With some macro magic and the non-standard __typeof__ operator, this can be considerably shortened:
#define set_array(ARRAY, ...) \
memcpy(ARRAY, ((__typeof__(ARRAY)){ __VA_ARGS__ }), sizeof(ARRAY))
int foo[16];
set_array(foo, 1, 2, 3);
Upvotes: 16
Reputation: 140032
Perhaps char myarr[16]={0x00}; isn't a good example to begin with, since both the explicit and implicit member initializations use zeros, making it harder to explain what's happening in that situation. I thought that a real-life example with non-zero values could be more illustrative:
/**
* Map of characters allowed in a URL
*
* !, ', (, ), *, -, ., 0-9, A-Z, _, a-z, ~
*
* Allowed characters are set to non-zero (themselves, for easier tracking)
*/
static const char ALLOWED_IN_URL[256] = {
/* 0 1 2 3 4 5 6 7 8 9*/
/* 0 */ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
/* 10 */ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
/* 20 */ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
/* 30 */ 0, 0, 0, '!', 0, 0, 0, 0, 0, '\'',
/* 40 */ '(', ')', '*', 0, 0, '-', '.', 0, '0', '1',
/* 50 */ '2', '3', '4', '5', '6', '7', '8', '9', 0, 0,
/* 60 */ 0, 0, 0, 0, 0, 'A', 'B', 'C', 'D', 'E',
/* 70 */ 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O',
/* 80 */ 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y',
/* 90 */ 'Z', 0, 0, 0, 0, '_', 0, 'a', 'b', 'c',
/* 100 */ 'd', 'e', 'f', 'g' , 'h', 'i', 'j', 'k', 'l', 'm',
/* 110 */ 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w',
/* 120 */ 'x', 'y', 'z', 0, 0, 0, '~',
};
This is a lookup table that can be used when URL-encoding a string. Only the characters that are allowed in a URL are set to a non-zero value. A zero means that the character is not allowed and needs to be URL-encoded (%xx). Notice that the table abruptly ends with a comma after the tilde character. None of the characters following the tilde are allowed and so should be set to zero. But instead of writing many more zeros to fill the table up to 256 entries, we let the compiler implicitly initialize the rest of the entries to zero.
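For completeness, here is one way such a table might be consumed. This is only a sketch: `url_encode`, its parameters, and its return convention are hypothetical and not part of the original answer. It takes the 256-entry table as an argument, so it works with ALLOWED_IN_URL or any other table where non-zero means "copy as-is".

```c
#include <stddef.h>

/*
 * Hypothetical helper: percent-encodes src into dst using a 256-entry
 * table in which a non-zero entry means the character may be copied as-is.
 * Returns the number of characters written (excluding the terminating
 * '\0'), or (size_t)-1 if dst is too small.
 */
size_t url_encode(const char allowed[256], const char *src,
                  char *dst, size_t dst_size)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t out = 0;

    if (dst_size == 0)
        return (size_t)-1;

    for (; *src != '\0'; src++) {
        /* Convert to unsigned char so the table index is always 0..255. */
        unsigned char c = (unsigned char)*src;
        size_t need = allowed[c] ? 1 : 3;

        /* Keep one byte free for the terminating '\0'. */
        if (out + need >= dst_size)
            return (size_t)-1;

        if (allowed[c]) {
            dst[out++] = (char)c;
        } else {
            dst[out++] = '%';
            dst[out++] = hex[c >> 4];
            dst[out++] = hex[c & 0x0F];
        }
    }
    dst[out] = '\0';
    return out;
}
```

A call would look something like `url_encode(ALLOWED_IN_URL, "a b", buf, sizeof buf)`, which should leave "a%20b" in buf. Every entry past the tilde is zero thanks to the implicit initialization, so bytes 127 through 255 take the percent-encoding branch without the table ever mentioning them.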
Upvotes: 6