Shrikant Giridhar

Reputation: 399

Why are datatypes self-aligned?

I understand that data structure alignment restrictions exist to optimize memory accesses, because modern CPUs fetch memory in word-sized chunks (or multiples of the word size). This would make me think that the optimal way to align data is to fixed word boundaries.

For example, consider the following structs on a 32-bit machine (compiled with gcc v6.2.0; CFLAGS: -Wall -g -std=c99 -pedantic):

#include <stdint.h>

struct layoutA {
    char a;       /* start: 0; end: 1; padding: 3 */
    uint32_t b;   /* start: 4; end: 8; padding: 0 */
    uint64_t c;   /* start: 8; end: 16; padding: 0 */
};

/* sizeof(struct layoutA) = 16 */

struct layoutB {
    uint32_t b;   /* start: 0; end: 4; padding: 4 */
    uint64_t c;   /* start: 8; end: 16; padding: 0 */
    char a;       /* start: 16; end: 17; padding: 7 */
};

/* sizeof(struct layoutB) = 24 */

Due to the self-alignment restriction, c forces the second struct to align itself to the 8-byte boundary instead of the word boundary (4-byte).

How does this reconcile with the original reason for alignment, memory optimization? It would appear that placing c at offset 4 should also let the CPU read it in 2 accesses, just like the current case, where it needs to access 2 words (at 8 and 12) to get the entire doubleword.

How does self-alignment optimize memory access? In other words, what benefit do we gain in the second case to justify losing the space to self-alignment?

Upvotes: 2

Views: 914

Answers (1)

chqrlie

Reputation: 145317

Alignment is implementation specific. Its primary purpose is not optimisation: on some architectures, word accesses must be aligned or they fault, and in C terms a misaligned access has undefined behavior.

On Intel architectures, most unaligned accesses can be configured to work correctly, but programmers should not rely on that, and compilers certainly don't. When unaligned accesses are supported, they are usually slower than aligned accesses, hence the optimisation effect.

If type uint64_t happens to require self-alignment, as seems to be the case on the target system, the layout for struct layoutB uses more memory than struct layoutA, but both require 64-bit alignment.

The benefit we get from self-alignment is code correctness. On a 32-bit architecture that does not require self-alignment of 64-bit integer variables, it is optional, but you still get an advantage: both 32-bit halves of an 8-byte-aligned value are guaranteed to come from the same cache line.

You could use packing attributes or pragmas to force a specific layout and run benchmarks to assess the impact on your target system. Such benchmarks are tricky to get right and may or may not show a difference.

In conclusion: alignment is implementation defined and should be left to the compiler, but careful ordering of structure members may yield better memory usage and significant savings for large arrays of structures.

Upvotes: 3
