c++ c language-lawyer standards memory-model

Reputation: 1184

Is Byte Really The Minimum Addressable Unit?

Section 3.6 of C11 standard defines "byte" as "addressable unit of data storage ... to hold ... character".

Section 1.7 of C++11 standard defines "byte" as "the fundamental storage unit in the C++ memory model ... to contain ... character".

Neither definition says that "byte" is the minimum addressable unit. Is this because the standards intentionally abstract away from any specific machine? Can you provide a real example of a machine where a C/C++ compiler was designed to have a "byte" longer or shorter than the minimum addressable unit?

Upvotes: 4

Answers (5)

sqr163

Reputation: 1184

(Thank you everyone who commented and answered, every word helps)

The memory model of a programming language and the memory model of the target machine are different things.

Yes, the byte is the minimum addressable unit in the context of the programming language's memory model.

No, the byte is not necessarily the minimum addressable unit in the context of the machine's memory model. For example, there are machines where the minimum addressable unit is longer or shorter than the "byte" of the programming language:

byte longer than unit: HP Saturn - 4-bit unit vs 8-bit byte with gcc (thanks Nate)
byte shorter than unit: IBM 7090 - 36-bit unit vs 6-bit byte (thanks Antti and Dave T.)
byte longer than unit: Intel 8051 - 1-bit unit vs 8-bit byte (thanks Busybee)
byte longer than unit: TI TMS34010 - 1-bit unit vs 8-bit byte (thanks Wcochran)
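From inside the language, the width of the "byte" is visible as CHAR_BIT, whatever the hardware's own addressable unit happens to be. A minimal sketch (the helper names bits_per_byte and bits_in are made up for illustration):

```cpp
#include <climits>
#include <cstddef>

// Number of bits in the language-level byte on this implementation.
constexpr int bits_per_byte() { return CHAR_BIT; }

// Storage size of a type in bits. sizeof counts language-level bytes,
// so this is always a multiple of CHAR_BIT.
template <typename T>
constexpr std::size_t bits_in() { return sizeof(T) * CHAR_BIT; }

// The C and C++ standards require a byte to hold at least 8 bits.
static_assert(CHAR_BIT >= 8, "a byte must have at least 8 bits");
```

On a typical desktop target bits_per_byte() is 8, but the language only promises "at least 8".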

Upvotes: 2

the busybee

Reputation: 12600

One example of a real machine and compiler where the minimum addressable unit is smaller than a byte is the 8051 family. One compiler I have used for it is Keil C51.

The minimal addressable unit is a bit. You can define a variable of this type, you can read and write it. However, the syntax to define the variable is non-standard. Of course, C51 needs several extensions to support all of this. BTW, pointers to bits are not allowed.

For example:

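/* bdata places the variable in the 8051's bit-addressable internal RAM (Keil C51 extension). */
unsigned char bdata bitsAddressable;
/* sbit names a single bit of that variable; here ^ selects bit 5, it is not XOR. */
sbit bitAddressed = bitsAddressable^5;

void f(void) {
    bitAddressed = 1;   /* writes only bit 5 of bitsAddressable */
}

/* bit is a non-standard one-bit type (Keil C51 extension). */
bit singleBit;

void g(bit value) {
    singleBit = value;
}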
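For comparison, here is a rough sketch of how the same effect looks without the C51 extensions: the program reads, modifies, and writes the whole byte and picks out bit 5 with a mask. The names write_bit, read_bit and kBitIndex are hypothetical and not part of any compiler's API.

```cpp
constexpr unsigned kBitIndex = 5;  // same bit position as in the C51 snippet

// Set or clear one bit by rewriting the whole byte that contains it.
void write_bit(unsigned char& byte, unsigned index, bool value) {
    const unsigned mask = 1u << index;
    if (value) {
        byte = static_cast<unsigned char>(byte | mask);
    } else {
        byte = static_cast<unsigned char>(byte & ~mask);
    }
}

// Read one bit out of a byte.
bool read_bit(unsigned char byte, unsigned index) {
    return ((byte >> index) & 1u) != 0;
}
```

The smallest thing this code ever touches in memory is an unsigned char. Whether the hardware performs a single-bit access underneath is up to the compiler.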

Upvotes: 3

Eric Postpischil

Reputation: 222753

A byte is the smallest addressable unit in strictly conforming C code. Whether the machine on which the C implementation executes a program supports addressing smaller units is irrelevant to this; the C implementation must present a view in which bytes are the smallest addressable unit in strictly conforming C code.

A C implementation may support addressing smaller units as an extension, such as simply by defining the results of certain pointer operations that are otherwise undefined by the C standard.
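Bit-fields are one place where this shows up in ordinary code: the language lets you declare members narrower than a byte, but you cannot take their address. A small sketch, with Flags and address_of_storage as made-up names:

```cpp
struct Flags {
    unsigned ready : 1;  // one bit wide
    unsigned mode  : 3;  // three bits wide
};

// sizeof counts whole bytes, so even a 4-bit payload occupies at least one.
static_assert(sizeof(Flags) >= 1, "objects occupy a whole number of bytes");

// The enclosing object has an address; writing `&f.ready` would not compile,
// because a bit-field is not an addressable unit.
const unsigned char* address_of_storage(const Flags& f) {
    return reinterpret_cast<const unsigned char*>(&f);
}
```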

Upvotes: 9

wcochran

Reputation: 10896

I programmed both the TMS34010 and its successor, the TMS34020, graphics chips back in the early 1990s. They had a flat address space and were bit-addressable, i.e. addresses indexed individual bits. This was very useful for the computer graphics of the time, when memory was a lot more precious.

The embedded C compiler didn't really have a way to access individual bits directly, since from a (standard) C language point of view the byte was still the smallest unit, as pointed out in a previous post.

Thus, if you want to read or write a stream of bits in C, you need to read or write (at least) a byte at a time and buffer the rest (for example, when writing an arithmetic or Huffman coder).
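A minimal sketch of that buffering idea, assuming 8-bit output units and most-significant-bit-first packing (both are my choices for illustration, and BitWriter is a hypothetical name, not code from the TMS340x0 toolchain):

```cpp
#include <cstdint>
#include <vector>

class BitWriter {
public:
    // Append one bit; bits fill each output byte from the top down.
    void put_bit(bool bit) {
        current_ = static_cast<std::uint8_t>((current_ << 1) | (bit ? 1u : 0u));
        if (++count_ == 8) flush_byte();
    }

    // Pad the last partial byte with zero bits and return everything written.
    std::vector<std::uint8_t> finish() {
        while (count_ != 0) put_bit(false);
        return bytes_;
    }

private:
    // Emit the accumulated byte and start a fresh one.
    void flush_byte() {
        bytes_.push_back(current_);
        current_ = 0;
        count_ = 0;
    }

    std::vector<std::uint8_t> bytes_;
    std::uint8_t current_ = 0;  // bits collected so far
    int count_ = 0;             // how many bits are in current_
};
```

Writing the bits 1, 0, 1 and then calling finish() yields the single byte 0xA0: the three bits followed by five padding zeros.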

Upvotes: 1

Nicol Bolas

Reputation: 473447

Neither definition says that "byte" is the minimum addressable unit.

That's because they don't need to. Byte-wise types (char, unsigned char, std::byte, etc.) have sufficient restrictions to enforce this requirement.

The size of byte-wise types is explicitly defined to be precisely 1:

sizeof(char), sizeof(signed char) and sizeof(unsigned char) are 1.

The alignment of byte-wise types is the smallest alignment possible:

Furthermore, the narrow character types (6.9.1) shall have the weakest alignment requirement

This doesn't have to be an alignment of 1, of course. Except... it does.

See, if the alignment were higher than 1, that would mean that a simple byte array wouldn't work. Array indexing is based on pointer arithmetic, and pointer arithmetic determines the next address based on sizeof(T). But if alignof(T) is greater than sizeof(T), then the second element in any array of T would be misaligned. That's not allowed.

So even though the standard doesn't explicitly say that the alignment of bytewise types is 1, other requirements ensure that it must be.
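These conclusions can be written down as compile-time checks. This is only a sketch. I would expect the assertions to hold on any conforming implementation, and array_elements_stay_aligned is a helper name invented here:

```cpp
#include <cstddef>

// Stated directly by the standard.
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");

// Follows from the array argument: alignment cannot exceed size.
static_assert(alignof(char) == 1, "char has alignment 1");
static_assert(alignof(signed char) == 1, "signed char has alignment 1");
static_assert(alignof(unsigned char) == 1, "unsigned char has alignment 1");

// For any T, consecutive array elements sit sizeof(T) bytes apart, so
// sizeof(T) must be a multiple of alignof(T) for element 1 to be aligned.
template <typename T>
constexpr bool array_elements_stay_aligned() {
    return sizeof(T) % alignof(T) == 0;
}
```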

Overall, this means that every pointer to an object has an alignment at least as restrictive as that of a byte-wise type, so no object pointer can be misaligned relative to byte-wise types. All valid, non-null pointers (pointers to a live object or one past the end of an object) must therefore be aligned at least well enough to point to a char.

Similarly, the difference between two pointers is defined in C++ as the difference between the array indices of the elements they point to (pointer subtraction in C++ requires that both pointers point into the same array). Additive pointer arithmetic is, as previously stated, based on the sizeof of the type being pointed to.
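To make the two ways of measuring distance concrete, here is a small sketch (index_distance and byte_distance are illustrative names; both pointers are assumed to point into the same array):

```cpp
#include <cstddef>

// Distance counted in elements: the difference of the array indices.
std::ptrdiff_t index_distance(const int* first, const int* last) {
    return last - first;
}

// The same distance counted in bytes, by viewing the array as chars.
std::ptrdiff_t byte_distance(const int* first, const int* last) {
    return reinterpret_cast<const char*>(last) -
           reinterpret_cast<const char*>(first);
}
```

For an int array a, the distance from &a[0] to &a[3] is 3 in elements and 3 * sizeof(int) in bytes. Neither measurement can ever come out as a fraction of a byte.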

Given all of these facts, even if an implementation has pointers whose addresses can address values smaller than char, it is functionally impossible for the C++ abstract model to generate such a pointer and still have it count as valid (pointing to an object or function, pointing one past the end of an array, or being null). You could create such a pointer value with a cast from an integer, but you would be creating an invalid pointer value.

So while technically there could be smaller addresses on the machine, you could never actually use them in a valid, well-formed C++ program.

Obviously compiler extensions could do anything. But as far as conforming programs are concerned, it simply isn't possible to generate valid pointers that are misaligned for byte-wise types.

Upvotes: 2
