Reputation: 179

C: How do I differentiate between an array of pointers and a pointer to an array?

Say I have the following code:

// x = whatever number

int *arr_of_ptr[x];
int (*ptr_to_arr)[x];

int **p1 = arr_of_ptr;
int **p2 = ptr_to_arr;

My understanding of arr_of_ptr is that "dereferencing an element of arr_of_ptr results in an int" - therefore the elements of arr_of_ptr are pointers to integers. On the other hand, dereferencing ptr_to_arr results in an array that I can then nab integers from, hence ptr_to_arr points to an array.

I also have a rough understanding that arrays themselves are pointers, and that arr[p] evaluates to (arr + p * sizeof(data_type_of_arr)) where the name arr decays to the pointer to the first element of arr.

So that's all well and good, but is there any way for me to tell whether p1 and p2 are pointers to arrays or arrays of pointers without prior information?

My confusion mostly stems from the fact that (I think) we can evaluate int **p two ways:

*(p + n * size) is what's giving me an int
(*p + n * size) is what's giving me an int

In hindsight this question might be poorly worded because I'm confusing myself a bit just looking back on it, but I really don't know how to articulate myself better. Sorry.

Upvotes: 4

Answers (3)

John Bode

Reputation: 123458

First,

I also have a rough understanding that arrays themselves are pointers, and that arr[p] evaluates to (arr + p * sizeof(data_type_of_arr)) where the name arr decays to the pointer to the first element of arr.

This isn't strictly correct. Arrays are not pointers. Under most circumstances, expressions of array type will be converted ("decay") to expressions of pointer type and the value of the expression will be the address of the first element of the array. That pointer value is computed as necessary and isn't stored anywhere.

Exceptions to the decay rule occur when the array expression is the operand of the sizeof, _Alignof, or unary & operators, or is a string literal used to initialize a character array in a declaration.

Having said all that, ptr_to_arr has pointer type, not array type - it will not "decay" to int **.

Given the declaration

T arr[N];

the following are true:

 Expression        Type            Decays to            Equivalent expression
 ----------        ----            ---------            ---------------------
        arr        T [N]           T *                  &arr[0]
       *arr        T               n/a                  arr[0]
     arr[i]        T               n/a                  n/a
       &arr        T (*)[N]        n/a                  n/a

The expressions arr, &arr[0], and &arr all yield the same value (modulo any differences in representation between types). After decay, arr has the same type as &arr[0], "pointer to T" (T *), while &arr has type "pointer to N-element array of T" (T (*)[N]).

If you replace T with pointer type P *, such that the declaration is now

P *arr[N];

you get the following:

 Expression        Type            Decays to            Equivalent expression
 ----------        ----            ---------            ---------------------
        arr        P *[N]          P **                 &arr[0]
       *arr        P *             n/a                  arr[0]
     arr[i]        P *             n/a                  n/a
       &arr        P *(*)[N]       n/a                  n/a

So given your declarations, it would be more correct to write something like this:

int arr[x];
int *p1 = arr;         // the expression arr "decays" to int *

int *arr_of_ptr[x];
int **p2 = arr_of_ptr; // the expression arr_of_ptr "decays" to int **

/**
 * In the following declarations, the array expressions are operands
 * of the unary & operator, so the decay rule doesn't apply.
 */
int (*ptr_to_arr)[x] = &arr;
int *(*ptr_to_arr_of_ptr)[x] = &arr_of_ptr;

Again, ptr_to_arr and ptr_to_arr_of_ptr are pointers, not arrays, and do not decay to a different pointer type.

EDIT

From the comments:

Can I just hand-wavily explain it as: an array of pointers has a name that can decay to a pointer,

Yeah, -ish, just be aware that it is hand-wavey and not really accurate (which is shown by example below). If you are a first-year student, your institution isn't doing you any favors by making you deal with C this early. While it is the substrate upon which most of the modern computing ecosystem is built, it is an awful teaching language. Awful. Yes, it's a small language, but aspects of it are deeply unintuitive and confusing, and the interplay between arrays and pointers is one of those aspects.

an array of pointers has a name that can decay to a pointer, but a pointer to an array, even when dereferenced, does not give me something that decays to a pointer?

Actually...

If ptr_to_arr has type int (*)[x], then the expression *ptr_to_arr would have type int [x], which would decay to int *. The expression *ptr_to_arr_of_ptr would have type int *[x], which would decay to int **. This is why I keep using the term "expression of array type" when talking about the decay rule, rather than just the name of the array.

Something I have left out of my explanations until now - why do array expressions decay to pointers? What's the reason for this incredibly confusing behavior?

C didn't spring fully-formed from the brain of Dennis Ritchie - it was derived from an earlier language named B (which was derived from BCPL, which was derived from CPL, etc.)¹. B was a "typeless" language, where data was simply a sequence of words or "cells". Memory was modeled as a linear array of "cells". When you declared an N-element array in B, such as

auto arr[N];

the compiler would set aside all the cells necessary for the array elements, plus an extra cell that would store the numerical offset (basically, a pointer) to the first element of the array, and that cell would be bound to the variable arr:

     +---+
arr: | +-+-----------+
     +---+           |
      ...            |
     +---+           |
     |   | arr[0] <--+
     +---+
     |   | arr[1]
     +---+
      ...
     +---+
     |   | arr[N-1]
     +---+

To index into the array, you'd offset i cells from the location stored in arr and dereference the result. IOW, a[i] was exactly equivalent to *(a + i).

When Ritchie was developing the C language, he wanted to keep B's array semantics (a[i] is still exactly equivalent to *(a + i)), but for various reasons he didn't want to store that pointer to the first element. So, he got rid of it entirely. Now, when you declare an array in C, such as

int arr[N];

the only storage set aside is for the array elements themselves:

+---+
|   | arr[0]
+---+
|   | arr[1]
+---+
 ... 
+---+ 
|   | arr[N-1]
+---+

There is no separate object arr which stores a pointer to the first element (which is part of why array expressions cannot be the target of an assignment - there's nothing to assign to). Instead, that pointer value is computed as necessary when you need to subscript into the array.

This same principle holds for multi-dimensional arrays as well. Assume the following:

int a[2][2] = { { 1, 2 }, { 3, 4 } };

What you get in memory is the following:

   Viewed as int             Viewed as int [2]
   +---+                     +---+
a: | 1 | a[0][0]           a:| 1 | a[0]
   +---+                     + - +
   | 2 | a[0][1]             | 2 |
   +---+                     +---+
   | 3 | a[1][0]             | 3 | a[1]
   +---+                     + - +
   | 4 | a[1][1]             | 4 |
   +---+                     +---+

On the left we view it as a sequence of int, while on the right we view it as a sequence of int [2].

Each a[i] has type int [2], which decays to int *. The expression a itself decays from type int [2][2] to int (*)[2] (not int **).

The expression a[i][j] is exactly equivalent to *(a[i] + j), which is equivalent to *( *(a + i) + j ).

¹ As detailed in The Development of the C Language.

Upvotes: 1

Jayant Jeet Tomar

Reputation: 177

#include <stdio.h>

int main(void) {
    int arr[] = {1, 2, 3};
    int *p1 = &arr[0];
    int *p2 = &arr[1];
    int *p3 = &arr[2];
    int *arr2[3];                   // array of 3 pointers to int
    arr2[0] = p1;
    arr2[1] = p2;
    arr2[2] = p3;
    int (*p4)[3] = &arr;            // pointer to an array of 3 int
    printf("%zu\n", sizeof(p4));    // size of a single pointer
    printf("%zu\n", sizeof(arr2));  // size of three pointers
    printf("%d\n", **p4);           // *p4 is the array arr, **p4 is arr[0]
    printf("%d\n", **arr2);         // *arr2 is p1, **arr2 is arr[0]
    return 0;
}

In the above code, arr is a normal integer array with 3 elements.
p1, p2, and p3 are normal pointers to these elements.
arr2 is an array of pointers storing p1, p2, and p3.
p4 is a pointer to an array, pointing to the array arr.

According to your question, you need to differentiate between p4 and arr2.
Since p4 is a pointer, its size is fixed (8 bytes on a typical 64-bit platform), while the size of arr2 varies with how many elements it contains (8 x 3 = 24).

Also, both **p4 and **arr2 print the first integer, but the intermediate types differ: *p4 is the whole array (type int [3]), while *arr2 is a single pointer (type int *).

The output of the above code on a typical 64-bit platform is:

8
24
1
1

Upvotes: 0

dbush

Reputation: 223872

The main difference is that this is legal:

int **p1 = arr_of_ptr;

While this is not:

int **p2 = ptr_to_arr;

Because arr_of_ptr is an array, it can (in most contexts) decay to a pointer to its first element. So because the elements of arr_of_ptr are of type int *, a pointer to an element has type int ** so you can assign it to p1.

ptr_to_arr, however, is not an array but a pointer, so there's no decaying happening. You're attempting to assign an expression of type int (*)[x] to an object of type int **. Those types are incompatible, so the compiler must issue a diagnostic. If you use p2 anyway, *p2 reinterprets the bytes of the array's int elements as an int *, which is not a meaningful pointer.

Upvotes: 4
