Mark Galeck

Reputation: 6405

Objects of which types can always be allocated at a page boundary?

On Linux, I would like to know which object types can always be placed at a page boundary. For which C types is this guaranteed? Please point me to the standard or documentation from which the answer follows.

Clarifying code:

#include <sys/mman.h>
#include <string.h>

typedef char TYPE;

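int main(void)
{
    TYPE val;
    void *mem;

    memset(&val, 0, sizeof(val));   /* give val a defined value before copying it */
    mem = mmap(NULL, sizeof(val), PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return 1;                   /* no mapping, nothing to compare */
    memcpy(mem, &val, sizeof(val));

    /* is *(TYPE*)mem guaranteed to be the same as val ? */
    return 0;
}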

The answer for char is yes, guaranteed.

For which other types is the answer yes, guaranteed?

Upvotes: 0

Views: 105

Answers (1)

Blabbo the Verbose

Reputation: 236

Object alignment requirements are determined by the hardware architecture (the processor in use), because the processor may be unable to load data from unaligned addresses. (In some cases, the kernel can support unaligned accesses by trapping the exception the processor raises when such an access is attempted, and emulating the instruction. This is slow, however.)

The hardware architectures supported by the Linux kernel are listed at https://www.kernel.org/doc/html/latest/arch.html.

We can summarize these by saying that no hardware architecture supported by Linux requires more than 64 bytes (512 bits) of alignment for any native data type supported by the processor, and page sizes are a multiple of 512 bytes (for historical reasons). We can therefore say categorically that on Linux, all data primitives can be accessed at page-aligned addresses.

(In Linux, you can use sysconf(_SC_PAGESIZE) to obtain the page size. Note that if huge pages are supported, they are larger, a multiple of this value.)

The above covers all C data types defined by the GNU C standard library and its extensions, because they only define scalar types, structures with elements aligned at natural boundaries (not packed), and vectorized types (using GCC/Clang vector extensions designed for SIMD architectures).

(You can define a packed structure that has to be allocated at a non-aligned address if you are really evil, using GCC type or variable attributes.)

If we look at the C standard, the type max_align_t (provided in <stddef.h> by the compiler; it is available even in freestanding environments where the standard C library is not available) has the maximum alignment needed for any object (except for evilly constructed packed structures mentioned above).

This means that _Alignof (max_align_t) tells you the maximum alignment required for the types the C standard defines. (Do remember to tell your C compiler the code uses features provided by the C11 standard or later, e.g. -std=c11 or -std=gnu2x.)

However, certain architectures with SIMD (single instruction, multiple data; GCC and other C compilers have added vector extensions for these, for example to implement the Intel <immintrin.h> MMX/SSE/AVX etc. intrinsics) may require larger alignment for vector types, up to the size of the vector registers. (This is where the figure of 512 bits, i.e. 64 bytes, comes from: the AVX-512 registers are that wide.) On x86-64 (the 64-bit Intel and AMD architecture currently in use) there are separate instructions for unaligned and aligned accesses, with unaligned accesses possibly slower than aligned ones, depending on the exact processor. So, _Alignof(max_align_t) does not apply to these vectorized types, which use vector extensions to the C standard. The C standard itself refers to such types as "requiring extended alignment".

In Linux, all types passed to the kernel, including pointers, must retain their information when cast to long, because the Linux kernel syscall interface passes syscall arguments as an array of up to six longs. See include/asm-generic/syscall.h:syscall_get_arguments() in the Linux kernel. (While this function is implemented for each hardware architecture separately, every implementation has the same signature, i.e. it passes the syscall arguments as long.)

The C standard does not define any relationship between the pointer address, and the value of a pointer when converted to a sufficiently large integer type. This is because there historically were architectures where this relationship was complicated (for example, on 8086 'far' pointers were 32-bit, where the actual address was 16*high16bits + low16bits). In Linux, however, the relationship is expected to be 1:1. This can be seen in things like /proc pseudo-filesystem, where pointers (say, in /proc/self/maps) are displayed in hexadecimal. See lib/vsprintf.c:pointer_string(), which is used to convert pointers to strings for userspace ABIs: it casts the pointer to unsigned long int, and prints the number value.

This means that when pointer ptr is N-byte aligned, in Linux, (unsigned long)ptr % N == 0.

While the C standard leaves signed integer overflow undefined, and leaves the result of converting an out-of-range value to a signed type implementation-defined, the Linux kernel expects and uses the GCC behaviour: signed integers use two's complement, and wrap around analogously to unsigned integers. This means that casts between long and unsigned long do not affect the storage representation and lose no information; the types only differ in whether the value they represent is considered signed or unsigned. Thus, any of the logic above wrt. long equally applies to unsigned long, and vice versa.

Finally, you can find variants paraphrasing the statement "on all currently supported architectures, ints are assumed to be 32 bits, longs the size of a pointer and long long 64 bits" in both the kernel sources and on the Linux Kernel Mailing List, LKML. See e.g. Christoph Hellwig in 2003. The patch (to documentation on adding syscalls) to explicitly mention that Linux currently only supports ILP32 and LP64 models was submitted in April 2021 with positive reactions, but hasn't been applied yet.

Upvotes: 1
