Reputation: 277
This program works in C:
#include <stdio.h>
int main(void) {
char a[10] = "Hello";
char *b = a;
printf("%s",b);
}
There are two things I would expect to be different. First, I would expect the second line of main to be "char *b = &a", so that the program looks like this:
#include <stdio.h>
int main(void) {
char a[10] = "Hello";
char *b = &a;
printf("%s",b);
}
But this does not work. Why is that? Isn't this the correct way to initialize a pointer with an address?
My second problem is with the last line. I would expect it to be printf("%s",*b), so that the program looks like this:
#include <stdio.h>
int main(void) {
char a[10] = "Hello";
char *b = a;
printf("%s",*b);
}
But this gives a segmentation fault. Why does this not work? Aren't we supposed to write "*" in front of a pointer to get its value?
Upvotes: 0
Views: 143
Reputation: 123468
Expanding on Steve's answer (which is the correct one to accept)...
This is the special rule he's talking about:
6.3.2.1 Lvalues, arrays, and function designators (C 2011 Prepublication Draft)
...
3 Except when it is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type ‘‘array of type’’ is converted to an expression with type ‘‘pointer to type’’ that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.
Arrays are weird and don't behave like other types. You don't get this "decay to a pointer to the first element" behavior in other aggregate types like struct types. You can't assign the contents of an entire array with the = operator like you can with struct types; for example, you can't do something like
int a[5] = {1, 2, 3, 4, 5};
int b[5];
...
b = a; // not allowed; that's what "is not an lvalue" means
Why are arrays weird?
C was derived from an earlier language named B, and when you declared an array in B:
auto arr[5];
the compiler set aside an extra word to point to the first element of the array:
+---+
arr: | | ----------+
+---+ |
... |
+---+ |
| | arr[0] <--+
+---+
| | arr[1]
+---+
| | arr[2]
+---+
| | arr[3]
+---+
| | arr[4]
+---+
The array subscript operation arr[i] was defined as *(arr + i) - given the starting address stored in arr, offset i elements from that address and dereference the result. This also meant that &arr would yield a different value from &arr[0].
When he was designing C, Ritchie wanted to keep B's array subscripting behavior, but he didn't want to set aside storage for the separate pointer that behavior required. So instead of storing a separate pointer, he created the "decay" rule. When you declare an array in C:
int arr[5];
the only storage set aside is for the array elements themselves:
+---+
arr: | | arr[0]
+---+
| | arr[1]
+---+
| | arr[2]
+---+
| | arr[3]
+---+
| | arr[4]
+---+
The subscript operation arr[i] is still defined as *(arr + i), but instead of storing a pointer value in arr, a pointer value is computed from the expression arr. This means &arr and &arr[0] will yield the same address value, but the types of the expressions will be different (int (*)[5] vs int *, respectively).
One practical effect of this rule is that you can use the [] operator on pointer expressions as well as array expressions - given your code you can write b[i] and it will behave exactly like a[i].
Another practical effect is that when you pass an array expression as an argument to a function, what the function actually receives is a pointer to the first element. This is why you often have to pass the array size as a separate parameter, because a pointer only points to a single object of the specified type; there's no way to know from the pointer value itself whether you're pointing to the first element of an array, how many elements are in the array, etc.
Arrays carry no metadata around, so there's no way to query an array for its size, or type, or anything else at runtime. The sizeof operator is evaluated at compile time, not runtime (the one exception being variable-length arrays).
Upvotes: 2
Reputation: 47952
There is a special rule in C. When you write
char *b = a;
you get the same effect as if you had written
char *b = &a[0];
That is, you automatically get a pointer to the array's first element. This happens any time you try to take the "value" of an array.
Aren't we supposed to write "*" in front of a pointer to get its value?
Yes, and if you wanted to get the single character pointed to by b, you would therefore need the *. This code
printf("first char: %c\n", *b);
would print the first character of the string. But when you write
printf("whole string: %s\n", b);
you get the whole string. %s prints multiple characters, and it expects a pointer. Down inside printf, when you use %s, it loops over and prints all the characters in the string.
Upvotes: 3