Reputation: 338
Strings can be initialized with a string literal
char word1[] = "abc";
or as a char array with a null terminator.
char word2[] = {'a', 'b', 'c', '\0'};
Instead of writing word1[], word1 can also be written with a pointer notation
char *word1 = "abc";
However, when trying to write word2 with a pointer notation
char *word2 = {'a', 'b', 'c', '\0'};
it shows me a bunch of warnings, such as
warning: excess elements in scalar initializer char *word2 = {'a', 'b', 'c', '\0'};
and when I run the program, I get Segmentation fault (core dumped).
Why is that? Why can you write char *word = "abc" but not char *word = {'a', 'b', 'c', '\0'}?
Upvotes: 5
Views: 976
Reputation: 47962
There's no fundamental reason for this -- it's just the way the language was originally defined.
The basic syntax for array initialization is
type array[] = {value, value, value};
The basic syntax for pointer initialization is
type *pointer = value;
But then we have string literals. And it turns out that, deep down inside, the compiler does two almost completely different things with string literals.
If you say
char array[] = "string";
the compiler treats it just about exactly as if you had said
char array[] = { 's', 't', 'r', 'i', 'n', 'g', '\0' };
But if you say
char *p = "string";
the compiler does something quite different. It quietly creates an array for you, containing the string, more or less as if you had written
char __hidden_unnamed_array[] = "string";
char *p = __hidden_unnamed_array;
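To make the two treatments concrete, here is a minimal sketch of the difference. The function names are hypothetical and exist only for illustration; call them from your own main. The pointer is declared const here as an added precaution, since writing through a pointer to a string literal is undefined behavior.

```c
#include <stddef.h>
#include <string.h>

/* Array form: the literal's characters are copied into the array,
   so its size is the six letters of "string" plus the '\0'. */
size_t array_init_size(void)
{
    char array[] = "string";
    return sizeof array;
}

/* The array is the function's own copy, so modifying it is fine.
   Returns 1 if the change is visible. */
int array_copy_is_writable(void)
{
    char array[] = "string";
    array[0] = 'S';
    return strcmp(array, "String") == 0;
}

/* Pointer form: p only holds the address of the hidden array that
   the compiler created for the literal. */
size_t pointer_init_length(void)
{
    const char *p = "string"; /* const: the hidden array must not be written to */
    return strlen(p);
}
```

Note that sizeof applied to p itself would give the size of a pointer, not the size of the string.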
But the point -- the answer to your question -- is that the compiler does this special thing only for string literals. In the original definition of C, at least, there was no way to use the {value, value, value} syntax to create a hidden, unnamed array that you could do something else with. The {value, value, value} syntax was only defined as working as the direct initializer for an explicitly-declared array.
As @pmg mentions in a comment, however, newer versions of C have a new syntax, the compound literal, which does let you, basically, "use the {value, value, value} syntax to create a hidden, unnamed array to do something else with". So you can in fact write
char *word2 = (char[]){'a', 'b', 'c', '\0'};
and this works just fine. It works in other contexts, too: for example, you can say things like
printf("%s\n", (char[]){'d', 'e', 'f', '\0'});
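As a rough sketch of how a compound literal behaves (it needs C99 or later), the helpers below use hypothetical names chosen for this example. One thing to keep in mind: a compound literal written inside a function has automatic storage duration, so a pointer to it should not be returned or used after the enclosing block ends.

```c
#include <stddef.h>
#include <string.h>

/* The compound literal creates an unnamed char array;
   word2 points at its first element. */
size_t compound_literal_length(void)
{
    char *word2 = (char[]){'a', 'b', 'c', '\0'};
    return strlen(word2);
}

/* Unlike the array behind a string literal, this unnamed array
   may be modified. Returns 1 if the change is visible. */
int compound_literal_is_writable(void)
{
    char *word2 = (char[]){'a', 'b', 'c', '\0'};
    word2[0] = 'x';
    return strcmp(word2, "xbc") == 0;
}
```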
Going back to a side question you asked: when you wrote
char *word2 = {'a', 'b', 'c', '\0'};
the compiler said to itself, "Wait a minute, word2 is one thing, but the initializer has four things. So I'll throw away three, and warn the programmer that I'm doing so." It then did the equivalent of
char *word2 = {'a'};
and if you later tried something like
printf("%s", word2);
you got a crash when printf tried to access address 0x00000061 (0x61 is the ASCII code for 'a', which ended up being used as the pointer value).
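Two small facts sit behind that explanation, and they can be checked safely without dereferencing a bogus pointer. This is only an illustrative sketch with hypothetical function names, and the second function assumes an ASCII-compatible character set.

```c
/* A scalar may legally be initialized by a single expression in
   braces; that is why only the extra elements are complained about. */
int braced_scalar(void)
{
    int n = {42};
    return n;
}

/* The numeric value of 'a'. On an ASCII system this is 0x61, the
   number the broken initializer stores in the pointer. */
int value_of_a(void)
{
    return 'a';
}
```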
Upvotes: 5
Reputation: 222933
Why can you initialize a string pointer as a string literal, but not as an array?
Because {'a', 'b', 'c', '\0'} is not an array; it is a list of values to put in the thing being initialized.
The syntax {'a', 'b', 'c', '\0'} does not stand for an array in C. People see it being used to initialize arrays, but, when used in that way, it is just a list of values. It could also be used to initialize a structure, because it is just listing values to put into the thing being initialized. It is not, by itself, an array.
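A short sketch of that point: the same brace list can fill an array or a structure, because it is only a list of values. The struct four_chars type and the function name are made up for this illustration.

```c
/* Hypothetical structure with four char members. */
struct four_chars {
    char a, b, c, d;
};

/* The same list of values initializes an array and a structure.
   Returns 1 if both received the same four values. */
int same_list_fills_array_and_struct(void)
{
    char arr[] = {'a', 'b', 'c', '\0'};
    struct four_chars s = {'a', 'b', 'c', '\0'};

    return arr[0] == s.a && arr[1] == s.b &&
           arr[2] == s.c && arr[3] == s.d;
}
```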
In char *word2 = {'a', 'b', 'c', '\0'};, it does not make sense to initialize word2 with the values 'a', 'b', 'c', and '\0'. It is just one pointer and should be initialized with one value. Giving a list of four values to initialize one thing does not make sense.
In char *word2 = "abc";, "abc" is not a list of values. It is a string literal. A string literal defines a static array that is filled with the characters of the string. And then the string literal is automatically converted to a pointer to its first element, and it is this pointer that is used to initialize word2.
So char *word2 = "abc"; does two things: The string literal defines an array, and the initialization sets word2 to point to the first element of that array. In contrast, in char *word2 = {'a', 'b', 'c', '\0'};, there is nothing to define an array; the list of values is just a list of values.
Comparing this to array initializations, in char word2[] = {'a', 'b', 'c', '\0'};, the array is initialized with a list of values, which is fine. However, in char word1[] = "abc";, something special happens. C 2018 6.7.9 14 says we can initialize an array of character type with a string literal, and the characters of the string will be used to initialize the elements of the array.
Upvotes: 9
Reputation: 224102
In general, the type of the initializer must match the type of what is being initialized.
This works:
char *word1 = "abc";
Because a string literal has type array of char, and such an array decays to type char * when used in an expression or initialization, which matches the declared type.
This works:
char word2[] = {'a', 'b', 'c', '\0'};
Because an array of char is being initialized with an initializer list of characters (technically they have type int but are converted to char).
This gives a warning:
char *word2 = {'a', 'b', 'c', '\0'};
Because an initializer list with more than one element is being used to initialize a pointer, which is a scalar and not an array or struct.
And this is OK:
char word1[] = "abc";
Because the C standard specifically allows initializing a char array with a string literal, as specified in section 6.7.9p14:
An array of character type may be initialized by a character string literal or UTF-8 string literal, optionally enclosed in braces. Successive bytes of the string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.
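The "if there is room" and "optionally enclosed in braces" parts of that paragraph are easy to miss, so here is a small sketch of what they imply. The function names are hypothetical. Note that the exact-fit case is valid C but produces an array that is not a null-terminated string (and some compilers may warn about it).

```c
#include <stddef.h>
#include <string.h>

/* Unknown size: the terminating '\0' is included, giving 4 bytes. */
size_t unknown_size_bytes(void)
{
    char w[] = "abc";
    return sizeof w;
}

/* Exact fit: there is no room for the '\0', so it is dropped.
   The result holds 'a', 'b', 'c' and must not be passed to
   functions that expect a terminated string. Returns 1 on match. */
int exact_fit_has_no_terminator(void)
{
    char w[3] = "abc";
    return sizeof w == 3 && w[0] == 'a' && w[1] == 'b' && w[2] == 'c';
}

/* Larger array: the elements after the string are set to zero. */
int larger_array_is_zero_filled(void)
{
    char w[8] = "abc";
    for (size_t i = 3; i < sizeof w; i++) {
        if (w[i] != '\0') {
            return 0;
        }
    }
    return 1;
}

/* The braces around the string literal are optional. */
int braces_are_optional(void)
{
    char w[] = {"abc"};
    return strcmp(w, "abc") == 0;
}
```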
Upvotes: 3