user394334

Reputation: 277

Arrays and pointers in C, two questions

This program works in C:

#include <stdio.h>


int main(void) {
    char a[10] = "Hello";
    char *b = a;
    
    printf("%s",b);
}

There are two things I would expect to be different. First, I would expect the second line of main to be "char *b = &a", so that the program looks like this:

#include <stdio.h>


int main(void) {
    char a[10] = "Hello";
    char *b = &a;
    
    printf("%s",b);
}

But this does not work. Why is that? Isn't this the correct way to initialize a pointer with an address?

The second problem is that I would expect the last line to be printf("%s",*b), so that the program looks like this:

#include <stdio.h>


int main(void) {
    char a[10] = "Hello";
    char *b = a;
    
    printf("%s",*b);
}

But this gives a segmentation fault. Why does this not work? Aren't we supposed to write "*" in front of a pointer to get its value?

Upvotes: 0

Views: 143

Answers (2)

John Bode

Reputation: 123468

Expanding on Steve's answer (which is the correct one to accept)...

This is the special rule he's talking about:

6.3.2.1 Lvalues, arrays, and function designators
...
3 Except when it is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type "array of type" is converted to an expression with type "pointer to type" that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.
C 2011 Prepublication Draft

Arrays are weird and don't behave like other types. You don't get this "decay to a pointer to the first element" behavior in other aggregate types like struct types. You can't assign the contents of an entire array with the = operator like you can with struct types; for example, you can't do something like

int a[5] = {1, 2, 3, 4, 5};
int b[5];
...
b = a; // not allowed; that's what "is not an lvalue" means

Why are arrays weird?

C was derived from an earlier language named B, and when you declared an array in B:

auto arr[5];

the compiler set aside an extra word to point to the first element of the array:

     +---+
arr: |   | ----------+
     +---+           |
      ...            |
     +---+           |
     |   | arr[0] <--+
     +---+
     |   | arr[1]
     +---+
     |   | arr[2]
     +---+
     |   | arr[3]
     +---+
     |   | arr[4]
     +---+

The array subscript operation arr[i] was defined as *(arr + i) - given the starting address stored in arr, offset i elements from that address and dereference the result. This also meant that &arr would yield a different value from &arr[0].

When he was designing C, Ritchie wanted to keep B's array subscripting behavior, but he didn't want to set aside storage for the separate pointer that behavior required. So instead of storing a separate pointer, he created the "decay" rule. When you declare an array in C:

int arr[5];

the only storage set aside is for the array elements themselves:

     +---+
arr: |   | arr[0]
     +---+ 
     |   | arr[1]
     +---+
     |   | arr[2]
     +---+
     |   | arr[3]
     +---+
     |   | arr[4]
     +---+

The subscript operation arr[i] is still defined as *(arr + i), but instead of storing a pointer value in arr, a pointer value is computed from the expression arr. This means &arr and &arr[0] will yield the same address value, but the types of the expressions will be different (int (*)[5] vs int *, respectively).

One practical effect of this rule is that you can use the [] operator on pointer expressions as well as array expressions - given your code you can write b[i] and it will behave exactly like a[i].

Another practical effect is that when you pass an array expression as an argument to a function, what the function actually receives is a pointer to the first element. This is why you often have to pass the array size as a separate parameter, because a pointer only points to a single object of the specified type; there's no way to know from the pointer value itself whether you're pointing to the first element of an array, how many elements are in the array, etc.

Arrays carry no metadata around, so there's no way to query an array for its size, or type, or anything else at runtime. The sizeof operator is evaluated at compile time, not runtime (the one exception is a variable-length array, whose size is computed at run time).

Upvotes: 2

Steve Summit

Reputation: 47952

There is a special rule in C. When you write

char *b = a;

you get the same effect as if you had written

char *b = &a[0];

That is, you automatically get a pointer to the array's first element. This happens any time you try to take the "value" of an array.

Aren't we supposed to write "*" in front of a pointer to get its value?

Yes, and if you wanted to get the single character pointed to by b, you would therefore need the *. This code

printf("first char: %c\n", *b);

would print the first character of the string. But when you write

printf("whole string: %s\n", b);

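you get the whole string. %s prints multiple characters, and it expects a pointer. Down inside printf, when you use %s, it loops over and prints all the characters in the string. If you pass *b instead, printf receives a single character value where it expects a pointer, tries to use that value as an address, and the behavior is undefined; a segmentation fault is the typical result.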

Upvotes: 3
