Ivan Venkov3

Reputation: 25

Is there an unaligned access problem in NASM?

I know what unaligned access is in C, and that it can cause for some processors UB.

I wonder if the same problem exists in code like this, written in NASM assembly:

    section .text
        global _start
_start:
        mov [arr], word "abcd"

        section .data
arr: db 1, 2, 3, 4, 5, 6, 7

Upvotes: 0

Views: 241

Answers (1)

Peter Cordes

Reputation: 364220

Generally no problem, x86 allows unaligned accesses for any size (with some limitations for 16-byte unaligned).

Some other ISAs don't (e.g. SPARC, MIPS before MIPS32r6, etc.) and C caters to those by not defining the behaviour when a T* pointer has less than alignof(T) alignment. In GNU C you can use __attribute__((aligned(1))) to typedef types that have well-defined behaviour at any alignment.
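As a rough sketch of those two options in C: the helper names `load_u32_memcpy` and `load_u32_gnu` are made up for illustration, and `may_alias` is an extra attribute added here (not mentioned above) so the cast doesn't also run into strict-aliasing rules.

```c
#include <stdint.h>
#include <string.h>

/* GNU C extension (GCC/clang): a uint32_t with alignment requirement 1.
 * may_alias is an addition for this sketch, so the type can also be
 * used to read bytes that were declared as some other type. */
typedef uint32_t unaligned_u32 __attribute__((aligned(1), may_alias));

/* Portable ISO C: memcpy has no alignment requirement on its arguments. */
static uint32_t load_u32_memcpy(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* GNU C only: dereferencing the under-aligned typedef is well-defined. */
static uint32_t load_u32_gnu(const void *p)
{
    return *(const unaligned_u32 *)p;
}
```

On x86-64, compilers typically turn either version into a single `mov` load.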


The .data section will be aligned to at least 4 bytes by default under Linux, so a 2-byte (word) store to [arr] is an aligned store; the address is guaranteed to be even (unless you use special linker options or a linker script to make .data start at an odd address). Your arr is at the very start of your .data section.
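"Aligned" here just means the address is a multiple of the access size. A small C sketch of that check, where `is_aligned` is a hypothetical helper and the `_Alignas(4)` array is only a stand-in for the question's `arr` at the start of a 4-byte-aligned section:

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if p is a multiple of align (align must be a power of 2). */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Stand-in for the question's arr; the explicit 4-byte alignment is an
 * assumption of this sketch, mirroring the default .data alignment. */
static _Alignas(4) unsigned char arr[7] = {1, 2, 3, 4, 5, 6, 7};
```

With this, `is_aligned(arr, 2)` holds, so a word store to `arr` is aligned, while one to `arr + 1` would not be.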

Also, "abcd" is a 4-byte constant that will have to be truncated to fit in a word. I guess you missed that when you tested your example to see that it happened to work on your own computer, before asking if it was safe in general?
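To see what the truncation does, here is a small C model of that store. It assumes NASM keeps the low 16 bits of the constant (with the first character in the low byte); `ABCD_DWORD` and `store_word_le` are names invented for this sketch.

```c
#include <stdint.h>

/* "abcd" as a 4-byte NASM constant: first character in the low byte. */
#define ABCD_DWORD 0x64636261u  /* 'a'=0x61 'b'=0x62 'c'=0x63 'd'=0x64 */

/* Model of a 16-bit little-endian store: only 2 bytes are written. */
static void store_word_le(unsigned char *dst, uint32_t value)
{
    uint16_t w = (uint16_t)value;        /* truncation drops 'c' and 'd' */
    dst[0] = (unsigned char)(w & 0xFF);  /* low byte first: 'a' */
    dst[1] = (unsigned char)(w >> 8);    /* then 'b' */
}
```

Applied to the question's array, only the first two bytes change: it ends up as 'a', 'b', 3, 4, 5, 6, 7.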

> cause for some processors UB

No, it's always UB in ISO C. See "Why does unaligned access to mmap'ed memory sometimes segfault on AMD64?" for an example and links. Note that Undefined Behaviour doesn't mean it does crash, just that the optimizer can assume it doesn't happen and the results can be unpredictable.

The behaviour is always well-defined in x86, like for most ISAs. Hardware vendors have to specify exactly what happens even in cases that raise exceptions, so OSes can be written to maintain control of the machine when user-space causes faults. (So in asm, what you're really looking for isn't defined-behaviour, but guaranteed non-faulting.)

Any misalignment is fine for any access size other than 16 bytes. (Assuming the AC bit is cleared, which is the case in normal systems. glibc memcpy for example would fault if you set it, for small unaligned copies. Unless you specifically set AC yourself as a way to detect unintentional unaligned accesses, you can assume it's cleared. There are also performance counters for split-loads and split-stores on modern CPUs which you can use instead to detect problematic ones.)

For 16-byte accesses, legacy-SSE accesses require natural alignment by default (e.g. SSE2 pxor xmm0, [rdi] requires alignment), except for instructions like movdqu unaligned load/store. (Other sizes like 8-byte don't require alignment, e.g. punpckldq mm0, [rdi] is alignment-safe because MMX registers are only 8 bytes wide, even though punpck instructions annoyingly do full-width loads instead of just the half that they shuffle into the destination.)

With AVX / AVX-512 encodings (VEX / EVEX), unaligned is the default (e.g. vaddps xmm0, xmm1, [rdi] doesn't require alignment), and only special alignment-required instructions like vmovntps-stores or vmovdqa load/store will fault on misalignment.
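From C, the same distinction shows up in the SSE2 intrinsics: `_mm_loadu_si128` is the movdqu-style load that accepts any address, while `_mm_load_si128` is the alignment-required one. A minimal sketch, assuming an x86 target with SSE2; the function name `sum16_unaligned` is invented here.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Sum 16 bytes starting at p; p may have any alignment. */
static int sum16_unaligned(const uint8_t *p)
{
    /* Unaligned 16-byte load: never faults on misalignment. */
    __m128i v = _mm_loadu_si128((const __m128i *)p);

    /* psadbw against zero: one byte-sum per 8-byte half. */
    __m128i s = _mm_sad_epu8(v, _mm_setzero_si128());

    /* Low half's sum is in bits 0-15, high half's in bits 64-79. */
    return _mm_cvtsi128_si32(s) + _mm_extract_epi16(s, 4);
}
```

Swapping in `_mm_load_si128` would make this fault with #GP whenever `p` is not a multiple of 16.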

The behaviour of alignment-required accesses is well-defined even for misaligned addresses: #GP fault for SSE/AVX misalignment, or #AC if you set the AC bit and did something that required 2, 4, or 8 bytes of alignment but didn't meet that requirement. (https://xem.github.io/minix86/manual/intel-x86-and-64-manual-vol3/o_fe12b1e2a880e0ce-231.html excerpts the relevant page of Intel's SDM PDFs.)

Under GNU/Linux, a user-space process will receive a SIGSEGV (segmentation fault) if it generates a #GP exception. IIRC, #AC might get the kernel to deliver a SIGBUS (bus error).


The only problems with unaligned access in x86 are performance

(Except as mentioned with legacy-SSE memory operands.)

Upvotes: 4
