Reputation: 27
Can anyone explain the difference between an aligned and an unaligned data transfer?
Upvotes: 0
Views: 15288
Reputation: 71536
It is not limited to AXI busses; it is a general term that applies to bus transfers. Unaligned transfers tend to have undesirable results, typically performance hits, but how much depends heavily on the overall architecture.
If addresses are in units of bytes (byte addressable), then a byte is always aligned. Assuming a byte is 8 bits, a 16 bit transfer is aligned if it is on a 16 bit boundary, meaning the lowest address bit is zero. A 16 bit value that covers addresses 0x1000 and 0x1001 is aligned, and is considered to be at address 0x1000 (big or little endian). A 16 bit value that covers addresses 0x1001 and 0x1002 is not aligned; it is considered to be at address 0x1001. One covering 0x1002 and 0x1003 would be aligned again. For a 32 bit value, the two lowest address bits need to be zero to be aligned. A 32 bit value at 0x1000 is aligned, but at 0x1001, 0x1002, or 0x1003 it would be unaligned.
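The rule above (the low address bits must be zero) can be sketched as a small check. This is only an illustration; the function name `is_aligned` is made up here, and it assumes transfer sizes are powers of two given in bytes.

```python
def is_aligned(addr: int, size_bytes: int) -> bool:
    """Return True if byte address addr is naturally aligned for size_bytes.

    Assumption: size_bytes is a power of two (1, 2, 4, 8, ...).
    """
    if size_bytes <= 0 or size_bytes & (size_bytes - 1):
        raise ValueError("size_bytes must be a positive power of two")
    # Aligned means the low log2(size_bytes) address bits are all zero.
    return (addr & (size_bytes - 1)) == 0


if __name__ == "__main__":
    for addr in (0x1000, 0x1001, 0x1002, 0x1003):
        print(hex(addr), "16-bit:", is_aligned(addr, 2), "32-bit:", is_aligned(addr, 4))
```

A byte (size 1) passes at every address, which matches the statement that a byte is always aligned.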
Memories are generally not 8 bits wide, either at the interface or in their geometry; it depends on what kind of memory it is and where it sits. The cache in a processor that stages the transfers to slow dram is likely going to be 32 or 64 bits wide or wider: some power of 2, or a power of 2 plus a parity bit or ecc (32, 33, or 40 bits). All of this is hidden from you, other than the performance hits you may run into. Take a memory that is 32 bits wide, and call a 32 bit value a word. That memory is word addressable: the address 0x123 is a word address, and its equivalent byte address is 0x123*4, or 0x48C. If you write a 32 bit value to byte address 0x48C, that becomes a single word write to that memory at that memory's address 0x123. But if you do a word write to byte address 0x48E, then you need to read word address 0x123 in that sram/memory, replace two of its bytes with bytes from the word you are writing, and write the modified word back. Then you have to read word address 0x124, modify two bytes, and write that modified word back.
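To make the read-modify-write sequence concrete, here is a toy model of a 32 bit wide, word-addressable memory. It is a sketch under assumptions that are not in the answer above: the memory is a Python dict keyed by word address, the byte order is little endian, and the name `write_word_unaligned` is hypothetical.

```python
WORD_BYTES = 4  # the memory is 32 bits wide


def write_word_unaligned(mem: dict, byte_addr: int, value: int) -> list:
    """Write a 32-bit value at any byte address into word-addressable memory.

    mem maps word address -> 32-bit word (missing words read as 0).
    Assumption: little-endian byte order.
    Returns the list of word addresses that were written.
    """
    data = (value & 0xFFFFFFFF).to_bytes(4, "little")
    touched = []
    i = 0
    while i < 4:
        word_addr = (byte_addr + i) // WORD_BYTES  # byte address -> word address
        offset = (byte_addr + i) % WORD_BYTES      # first byte used inside that word
        n = min(WORD_BYTES - offset, 4 - i)        # bytes that land in this word
        # Read the old word, replace n bytes, write it back.
        # (For an aligned write n is 4, so the whole word is simply replaced.)
        old = bytearray(mem.get(word_addr, 0).to_bytes(4, "little"))
        old[offset:offset + n] = data[i:i + n]
        mem[word_addr] = int.from_bytes(old, "little")
        touched.append(word_addr)
        i += n
    return touched


if __name__ == "__main__":
    mem = {0x123: 0x11111111, 0x124: 0x22222222}
    print([hex(w) for w in write_word_unaligned(mem, 0x48E, 0xAABBCCDD)])
    print(hex(mem[0x123]), hex(mem[0x124]))
```

With these assumptions, the write to byte address 0x48E touches word addresses 0x123 and 0x124, while the same write to 0x48C touches only 0x123.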
Various busses work in various ways. Some will put the single word on a word sized bus and allow unaligned addresses. A 32 bit wide axi would need to turn that 0x48E word write into two axi transfers, one with two byte lanes enabled in the byte mask and the second with the other two byte lanes enabled. A 64 bit wide axi bus (0x48E is 10010001110 in binary, so the low three address bits are 110) would also need two axi transfers, one with 16 bits of the data and a second with the other 16 bits, because of where that 32 bits lands. But a word transfer at address 0x1001 would/should be a single transfer on a 64 bit axi bus, with four contiguous byte lanes (lanes 1 through 4) enabled.
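The byte-lane arithmetic in the examples above can be sketched as follows. This is a simplified model, not the full AXI rules: it only splits an access at bus-width boundaries and computes a byte mask per transfer, similar in spirit to AXI's WSTRB write strobes. The function name `split_into_beats` is hypothetical.

```python
def split_into_beats(byte_addr: int, size_bytes: int, bus_bytes: int) -> list:
    """Split an access into bus-width transfers with byte-lane masks.

    Returns a list of (bus_aligned_address, strobe_mask) pairs, where bit i
    of strobe_mask is set when byte lane i carries data in that transfer.
    Assumption: bus_bytes is the bus width in bytes (4 for 32 bit, 8 for 64 bit).
    """
    beats = []
    addr = byte_addr
    remaining = size_bytes
    while remaining > 0:
        lane = addr % bus_bytes                # first byte lane used in this transfer
        n = min(bus_bytes - lane, remaining)   # bytes that fit before the bus boundary
        strobe = ((1 << n) - 1) << lane        # n consecutive lanes starting at 'lane'
        beats.append((addr - lane, strobe))
        addr += n
        remaining -= n
    return beats


if __name__ == "__main__":
    for addr, bus in ((0x48E, 4), (0x48E, 8), (0x1001, 8)):
        beats = split_into_beats(addr, 4, bus)
        print(hex(addr), f"{bus * 8}-bit bus:",
              [(hex(a), format(s, f"0{bus}b")) for a, s in beats])
```

In this model the 0x48E word write takes two transfers on both the 32 bit and the 64 bit bus, and the word at 0x1001 fits in one 64 bit transfer using lanes 1 through 4.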
Some other bus schemes work like this and some don't. Some will let a 32 bit item sit anywhere in the 32 or 64 bit bus, but then the memory controller on the other end has to do the multiple transactions to the cache, or create multiple transactions on the next bus.
It is technically possible to byte address dram, as far as some of the standard parts and busses go. But another thing a cache buys you is that the smaller and unaligned transactions can hit the faster sram, while the cache line reads and evictions can be optimized for the next bus or the external memory. Dram, for example, in most of the systems we use can then always be accessed aligned, in multiples of the bus width (64 or 64+ecc for desktops and servers, 32 or 16 bit for embedded systems, laptops, phones). The two busses and solutions can each be optimized for their own side, with the cache being the translator.
Upvotes: 3