The array data type in C

Question

While reading about pointers and arrays in C, I got a little confused. On the one hand, an array can be seen as a data type. On the other hand, an array is an unmodifiable lvalue. I imagine that the compiler does something like replacing the array's identifier with a constant address plus an expression that calculates the position given by the index at runtime.

myArray[3] -(compiler)-> AE8349F + 3 * sizeof()

When saying that an array is a data type, what does this exactly mean? I hope you can help me to clarify my confused understanding of what an array really is and how it is treated by the compiler.

haccks · Accepted Answer

When saying that an array is a data type, what does this exactly mean?

A data type is a set of values with predefined characteristics. Examples of data types are integer, floating-point number, character, string, and pointer.

An array is a group of memory locations related by the fact that they all have the same name and the same type.

If you are wondering why an array is not modifiable, the best explanation I have ever read is this:

C didn't spring fully formed from the mind of Dennis Ritchie; it was derived from an earlier language known as B (which was derived from BCPL).¹ B was a "typeless" language; it didn't have different types for integers, floats, text, records, etc. Instead, everything was simply a fixed length word or "cell" (essentially an unsigned integer). Memory was treated as a linear array of cells. When you allocated an array in B, such as

auto V[10];

the compiler allocated 11 cells; 10 contiguous cells for the array itself, plus a cell that was bound to V containing the location of the first cell:

    +----+
V:  |    | -----+
    +----+      |
     ...        |
    +----+      |
    |    | <----+
    +----+
    |    |
    +----+
    |    |      
    +----+
    |    |
    +----+
     ...

When Ritchie was adding struct types to C, he realized that this arrangement was causing him some problems. For example, he wanted to create a struct type to represent an entry in a file or directory table:

struct {
  int inumber;
  char name[14];
};

He wanted the structure to not just describe the entry in an abstract manner, but also to represent the bits in the actual file table entry, which didn't have an extra cell or word to store the location of the first element in the array. So he got rid of it - instead of setting aside a separate location to store the address of the first element, he wrote C such that the address of the first element would be computed when the array expression was evaluated.

This is why you can't do something like

int a[N], b[N];
a = b;

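because both a and b evaluate to pointer values in that context; it's equivalent to writing 3 = 4. There's nothing in memory that actually stores the address of the first element in the array; the compiler works it out from the array's own location whenever the array expression is used (at translation or link time for a static array, and typically as an offset within the stack frame for an automatic one).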

¹ This is all taken from the paper "The Development of the C Language".

For more detail you may like to read this answer.

EDIT: For more clarity, here is the difference between a modifiable l-value, a non-modifiable l-value, and an r-value (in short):

The difference among these kinds of expressions is this:

A modifiable l-value is addressable (can be the operand of unary &) and assignable (can be the left operand of =).

A non-modifiable l-value is addressable, but not assignable.

An r-value is neither addressable nor assignable.
