OckhamsRazor

Reputation: 4906

C Strings: Simple Question

I have three variables, initialised below:

char c1[] = "Hello";
char c2[] = { 'H', 'e', 'l', 'l', 'o', '\0'};
char* c3 = "Hello";

I am aware that c1 and c2 are the same, and that they are both strings because they are terminated by \0. However, c3 is different from c1 and c2. Is this because c3 does not terminate with a \0? Does that mean that c3 is not a string? If c3 is not a string, then why does printf("%s", c3); not give an error? Thanks!

EDIT:

Is there a reason why c1 and c2 can be modified but c3 can't?

Upvotes: 6

Views: 344

Answers (9)

paxdiablo

Reputation: 881423

In terms of C, the most pertinent difference between c3 and the others is that you are not allowed to attempt to modify the underlying characters with c3. I often find it helpful to think of it like this:

char *xyz = "xyz";

will create a modifiable pointer on the stack and make it point at the non-modifiable character sequence {'x','y','z','\0'}. On the other hand,

char xyz[] = "xyz";

will create a modifiable array on the stack big enough to hold the character sequence {'x','y','z','\0'} and then copy that character sequence into it. The array contents will then be modifiable. Keep in mind the standard says nothing about stacks but this is commonly how it's done. It is just a memory aid, after all.
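As a rough sketch of that memory aid (the function name is hypothetical and only for illustration), the array form gets its own writable copy while the pointer form merely refers to the literal. The pointer is declared const here so the compiler rejects accidental writes:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: edits the array copy and checks that the
   literal reached through the pointer is unaffected. */
int array_copy_is_independent(void)
{
    char xyz[] = "xyz";        /* array holding its own copy of the characters */
    const char *lit = "xyz";   /* pointer to the literal; const guards against writes */

    assert(sizeof xyz == 4);   /* 'x', 'y', 'z', '\0' */

    xyz[0] = 'a';              /* fine: only the array's copy changes */
    /* lit[0] = 'a'; */        /* rejected by the compiler because of const */

    return strcmp(xyz, "ayz") == 0 && strcmp(lit, "xyz") == 0;
}
```

Calling array_copy_is_independent() should return 1: the array now holds "ayz" and the literal still reads "xyz".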

Formally, c3 is a pointer to a string literal, while c1 and c2 are both arrays of characters which happen to end with a null character. When they're passed to functions like printf, they decay to a pointer to the first element of the array, which means they'll be treated identically to c3 within that function (actually they decay under quite a few circumstances; see the third quote from C99 below for the exceptions).

The relevant sections of C99 are 6.4.5 String literals, which explains why you're not allowed to modify what c3 points to:

It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.

and why it does have a null terminator:

In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals.

And 6.3.2.1 Lvalues, arrays, and function designators under 6.3 Conversions states:

Except when it is the operand of the sizeof operator or the unary & operator, or is a string literal used to initialize an array, an expression that has type ‘‘array of type’’ is converted to an expression with type ‘‘pointer to type’’ that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.
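A small sketch of the sizeof exception and of ordinary decay, using hypothetical helper names that are not part of the question:

```c
#include <assert.h>
#include <stddef.h>

/* sizeof applied to the array itself: no decay, so the whole array is measured. */
size_t array_size(void)
{
    char c1[] = "Hello";
    return sizeof c1;          /* 6: five letters plus the terminating '\0' */
}

/* sizeof applied to a pointer: only the pointer is measured. */
size_t pointer_size(void)
{
    const char *c3 = "Hello";
    return sizeof c3;          /* sizeof(char *), which depends on the platform */
}

/* In most other expressions the array name converts to &c1[0]. */
int decays_to_first_element(void)
{
    char c1[] = "Hello";
    const char *p = c1;        /* decay happens here */
    assert(p[5] == '\0');      /* the terminator is reachable through the pointer */
    return p == &c1[0];
}
```

This is why sizeof c1 and sizeof c3 generally differ even though printf("%s", ...) treats both the same way.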

Upvotes: 10

Master C

Reputation: 1546

c3 is a pointer to the first character of the string. c1 and c2 are ordinary arrays; no separate pointer refers to them.

Upvotes: 0

iammilind

Reputation: 69988

First point,

char* c3 = "Hello"; // may be valid C, but bad C++!

is an error-prone style, so don't use it. Instead, use

const char* c3 = "Hello";

This is valid code. The pointer c3 holds the address of the location where the string "Hello" is stored. But you cannot modify *c3 (i.e. the characters c3 points to) as you could in the earlier cases; if you do so, it is undefined behavior.

Upvotes: 5

check123

Reputation: 2009

It points to a string, but it is risky.

Upvotes: 0

sanjoyd

Reputation: 3340

In C, a string literal such as "string" can have two meanings, depending on the context in which it is used. It can denote a string in the read-only (ro) section of the executable (though I don't think the standard spells that out), in which case const char *foo = "bar" initializes foo to point to a location in the loaded executable's memory. If the binary blob ("bar") is indeed in the ro section and you do something like foo[0] = 'x', you'll end up with a SIGSEGV.

However, when you write char x[] = "Hello" (or char x[6] = "Hello"), you're using "Hello" as an array initializer (like int x[2] = { 1, 2 }), and x is just a regular (writable) array allocated on the stack. In this case, "Hello" is just shorthand for { 'H', 'e', 'l', 'l', 'o', '\0' }.

Both "bar" and "Hello" are null terminated.
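A minimal sketch to check both claims, with hypothetical function names: the two initializer forms should produce byte-identical arrays, and the literal reached through a pointer should carry its terminator too.

```c
#include <assert.h>
#include <string.h>

/* Compares an array initialized from a literal with one
   initialized element by element. */
int initializers_match(void)
{
    char a[] = "Hello";
    char b[] = { 'H', 'e', 'l', 'l', 'o', '\0' };

    assert(sizeof a == 6);     /* the literal form includes the '\0' */
    return sizeof a == sizeof b && memcmp(a, b, sizeof a) == 0;
}

/* Checks that the literal behind a pointer is null terminated as well. */
int literal_is_terminated(void)
{
    const char *c3 = "Hello";
    return c3[5] == '\0' && strlen(c3) == 5;
}
```

Both functions should return 1 on a conforming implementation.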

Upvotes: 1

Michael Burr

Reputation: 340208

c3 is a pointer to a string, which is what printf("%s", ...) expects as its argument.

The reason that printf("%s", c1) or printf("%s", c2) would also work is that in C arrays 'decay' to pointers very easily in expressions. In fact, about the only times that an array name doesn't decay into a pointer in an expression are when it's used as the operand of the sizeof operator or of the & (address-of) operator.

This leads to the common confusion that pointers and arrays are equivalent in C, which is not correct. It's just that in C arrays can be used nearly everywhere pointers can. One exception is that they can't be assigned to, except when they're subscripted (which, it turns out, is an expression that treats them as a pointer).
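To illustrate the assignment difference, here is a short sketch (the function name is made up for this example). The line that would not compile is left commented out:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: reseats a pointer, edits an array element,
   and reports whether both operations had the expected effect. */
int reseat_pointer_and_edit_array(void)
{
    char c1[] = "Hello";
    const char *c3 = "Hello";

    assert(sizeof c1 == 6);    /* c1 is an array object, not a pointer */

    c3 = "World";              /* OK: a pointer can be made to point elsewhere */
    /* c1 = "World"; */        /* would not compile: an array is not assignable */
    c1[0] = 'J';               /* OK: a subscripted element is a modifiable lvalue */

    return strcmp(c3, "World") == 0 && strcmp(c1, "Jello") == 0;
}
```

The function should return 1: the pointer now refers to a different literal, and the array holds "Jello".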

Note that there's one other difference in the last string: since it's a string literal, it cannot be modified (it's undefined what will happen if you try).

Upvotes: 3

mpen

Reputation: 282875

c1 and c2 allocate 6 bytes of memory and store the null-terminated string in it.

c3, however, allocates the (also null-terminated) string in program memory, and creates a pointer to it, i.e., the string is stored with the other instructions rather than on the stack (or heap? someone correct me) so editing it would be unsafe.

Upvotes: 1

shernshiou

Reputation: 518

It is a pointer to a string with different termination.

Upvotes: 0

wallyk

Reputation: 57774

c3 is not terminated by a NUL or a NULL. It is a pointer to a string terminated by a NUL.

Upvotes: 0
