Reputation: 93
#include <stdio.h>

int main(void)
{
    char c1[5]="abcde";
    char c2[5]={'a','b','c','d','e'};
    char *s1 = c1;
    char *s2 = c2;
    printf("%s",s1);
    printf("%s",s2);
    return 0;
}
In this code snippet, the char array c2 doesn't produce any error, but the char array c1 produces "string too long". I know that c1 requires a size of 6 to store 5 characters, because it stores the \0 (null character) in the last index. But then I'm confused about why c2 works just fine.
Also, when c2 is printed using %s, the output is abcde@, where @ is a gibberish character. %s with printf prints all the characters starting from the given address until \0 is encountered. I don't understand why it prints that extra character at the end.
Upvotes: 0
Views: 1295
Reputation: 222372
I know that C1 must require a size of 6 to store 5 characters as it stores the \0 (NULL char) in the last index. But I'm confused why C2 works just fine then?
The compiler does not complain about the initialization of c2 because initializing with {'a','b','c','d','e'} does not implicitly include a terminating null character.
In contrast, initializing with "abcde" does include a null character: The C standard defines a string literal to include a terminating null character, so char c1[5]="abcde"; nominally initializes a 5-element array with 6 values. The C standard does not require a warning or error in this case because C 2018 6.7.9 14 indicates that the null character may be omitted if the array does not have room for it. However, the compiler you are using (see footnote 1) has chosen to issue a warning message because this form of initialization often indicates an error: The programmer attempted to initialize an array with a string, but there is not room for the full string.
In C, arrays of characters and strings are different things: An array is a sequence of values, and an array of characters can contain any arbitrary values of those characters, including no zero value at the end and possible zero values in the middle. For example, if we have a buffer of bytes from a binary file, the bytes are just integer values to us; their meaning as characters that might be printed is irrelevant. A string is a sequence of characters that is terminated by a null character. It cannot have internal zero values because the first null character marks the end.
So, when you define an array of characters such as char c1[5], the compiler does not automatically know whether you intend to use it to hold strings or to hold arbitrary values. When you initialize the array with a string, your compiler essentially figures you intend to use the array to hold strings, and it warns you if the string you use to initialize the array does not fit. When you initialize the array with a list of values, your compiler essentially figures you may be using it to hold arbitrary values, and it does not warn you that there could be a missing terminator.
Also, when C2 is printed using %s, the output is abcde@ where @ is a gibberish character.
Because c2 does not have a terminating null character, attempting to print it with %s runs off the end of the array, resulting in behavior not defined by the C standard. Commonly, printf continues reading memory beyond the array, printing whatever happens to be there until it reaches a null character.
1 This assumes you are indeed using a C compiler to compile this source code. C++ has different rules and does not permit an array being initialized with a string literal to be too short to include the terminating null character.
Upvotes: 2
Reputation: 224864
You've created two unterminated strings. Make your arrays big enough to hold the null terminator and you'll avoid this undefined behaviour:
char c1[6] = "abcde";
char c2[6] = {'a','b','c','d','e','\0'};
Strictly speaking, the latter doesn't actually require the '\0'. This declaration is equivalent and will include the null terminator, because elements without an explicit initializer are set to zero:
char c2[6] = {'a','b','c','d','e'};
I personally prefer the first form, but with the added convenience of being able to leave out the explicit length:
char c1[] = "abcde";
Upvotes: 3