Reputation: 9427
Knowing that Intel and AMD processors fetch instructions in their native word length (mainly 64-bit nowadays), I asked my brother about it. He said that, to get the processor to run more efficiently, some assembly programmers pad their instructions to 32 bits with nops if the next instruction would push the byte count past 4 or 8 bytes:
xor ax, ax ; 2 bytes
nop ; 1
nop ; 1
So is there any benefit to doing this?
Upvotes: 1
Views: 1282
Reputation: 12263
Yes, it can substantially increase performance on AMD Bulldozer and Intel Atom and, to a lesser degree, on Intel Core 2 and Nehalem. For Bulldozer and Core 2, align on a 16-byte boundary; for Atom, align on an 8-byte boundary. However, it is preferable to use additional prefixes or longer instruction forms instead of NOPs. Note that aligning instructions only makes sense if you need more than half of peak IPC.
Upvotes: 2
Reputation: 4732
There is no reason for the nop instructions in your example. Generally, the only use for instruction alignment is to maximize the number of instructions fetched at the target of a control-flow branch, e.g. a function call. Modern x86 fetch and decode units are well optimized for the variable-length nature of x86 encoding. Padding like this only slows things down.
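To illustrate aligning branch targets rather than padding individual instructions, here is a minimal sketch in NASM syntax for 64-bit mode. The function name sum_bytes, the register usage, and the choice of a 16-byte boundary are assumptions for illustration only, not something taken from the manual or the question.

```nasm
; Hypothetical example: align only the places that control flow jumps to.
section .text
global sum_bytes

align 16                    ; pad so the function entry starts on a 16-byte boundary
sum_bytes:                  ; assumed inputs: rdi = pointer, rsi = byte count (nonzero)
    xor     eax, eax        ; running total = 0
align 16                    ; pad so the loop's branch target is also aligned
.loop:
    movzx   edx, byte [rdi] ; load the next byte, zero-extended
    add     eax, edx        ; add it to the total
    inc     rdi             ; advance the pointer
    dec     rsi             ; one fewer byte remaining
    jnz     .loop           ; jump back to the aligned target
    ret
```

In a code section NASM fills the gap left by align with nop bytes by default. The padding before .loop is executed once on the way into the loop, so whether the second align is worth it depends on the processor and on how hot the loop is; measuring is the only way to know.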
A scan of the Intel Volume 4 optimization manual (maybe a few years out-of-date) provided no reasons for instruction padding.
Upvotes: 4