Caribou

Reputation: 2081

The effects of data (mis)alignment

I'm using an x86 Intel machine running Windows 7, with Visual C++ (2005 and 2012 Express).

I have been playing with alignment as a learning exercise. I understand the implications for the size of a class/struct in terms of padding, and I believe I understand that aligned data is preferable because of the way CPU instructions expect to access it.

I have been looking at various resources, for example the (interesting) question "c++ data alignment /member order & inheritance" and other links such as Wikipedia: http://en.wikipedia.org/wiki/Data_structure_alignment

One area that can be affected (I read) is performance: registers expect data of specific sizes, so misaligned data can cause issues (see Wikipedia).

I wrote some code in which I created 3 structs, all with the same members: one with packing set to 1, one with normal alignment, and one with the members rearranged. This gave me objects with sizeof 8, 10 and 12. I ran code similar to the following for each:

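#include <algorithm>               // std::fill, std::for_each
#include <iostream>
#include <vector>
#include <boost/timer/timer.hpp>   // boost::timer::auto_cpu_timer

struct MixedData1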
{
    char Data1;
    short Data2;
    int Data3;
    char Data4;

    void operator() (MixedData1& md)
    {
        md.Data1 = 'a';
        md.Data2 = 1024;
        md.Data3 = 1000000;
        md.Data4 = 'b';
    }
};

typedef std::vector<MixedData1> MDVector;


int main(int argc, char* argv[])
{
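    MixedData1 md = MixedData1();   // value-initialise so std::fill copies defined values
    for(int count = 0; count < 10 ; count++)
    {
        {
            std::cout << sizeof(md) << std::endl;
            boost::timer::auto_cpu_timer t;   // reports the elapsed time when it goes out of scope
            MDVector mdv(10000000);
            std::fill(mdv.begin(),mdv.end(),md );
            std::for_each(mdv.begin(),mdv.end(),md);
        }
    }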
}

I'm not really interested in the values, so each element in the vector is initialised the same. The results indicated that the running time increased with the size of the struct, i.e. with pack(1) (8 bytes) I got the quickest time, 0.08s, and with normal alignment (12 bytes) I got the slowest, 0.105s.
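For reference, here is a minimal sketch of the two layouts behind the 8-byte and 12-byte figures. The names PackedData and AlignedData are hypothetical (they are not in the original test), and the sizes assume a 2-byte short and a 4-byte int, as on x86 with Visual C++. static_assert needs a C++11 compiler, so this will not build with Visual C++ 2005.

```cpp
#include <cstddef>   // offsetof

// Hypothetical names: the same four members, with and without packing.
#pragma pack(push, 1)
struct PackedData            // no padding anywhere
{
    char  Data1;             // offset 0
    short Data2;             // offset 1 (misaligned)
    int   Data3;             // offset 3 (misaligned)
    char  Data4;             // offset 7
};
#pragma pack(pop)

struct AlignedData           // default alignment
{
    char  Data1;             // offset 0, followed by 1 byte of padding
    short Data2;             // offset 2
    int   Data3;             // offset 4
    char  Data4;             // offset 8, followed by 3 bytes of tail padding
};

// Assumes sizeof(short) == 2 and sizeof(int) == 4.
static_assert(sizeof(PackedData) == 8, "packed layout has no padding");
static_assert(sizeof(AlignedData) == 12, "default layout pads to a multiple of 4");
static_assert(offsetof(PackedData, Data3) == 3, "Data3 is misaligned when packed");
static_assert(offsetof(AlignedData, Data3) == 4, "Data3 is naturally aligned by default");
```

The only member whose offset is not a multiple of its size is in the packed variant, which is why any alignment penalty would have to show up on accesses to Data2 or Data3 rather than on whole-struct copies.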

My questions are about the effects of being wrongly aligned. I don't think I have ever had any issues with alignment throughout my X years as a C++ programmer, but of course it could have just passed me by.

(1) The alignment appeared to have an effect in my test; however (edit), as Neil posted, that was only due to the difference in struct size. I tried accessing the member as per his reply but saw no real effect there. Is there a clearer example? Is there a way I can see a dramatic effect of misalignment?

(2) Is there a way to induce a crash caused by misalignment?

Upvotes: 2

Views: 1955

Answers (2)

Jørgen Fogh

Reputation: 7656

The short answer: It doesn't matter in practice.

Here's why: one or two cache misses are likely to take well under a millisecond, so accessing unaligned data will only be a problem if:

  1. The data straddles two cache lines
  2. You access many unaligned pieces of data, which aren't contiguous in memory.

Since 2. will generate a large number of cache misses anyway, you shouldn't be in that situation even if the data is aligned. Improving alignment would reduce the number of cache misses by at most a factor of two, whereas storing data contiguously could improve performance many times over.
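To make condition 1 concrete, here is a small sketch of a check for whether an access straddles a cache line. The helper name straddles_cache_line is hypothetical, and the 64-byte line size is an assumption (common on current x86 processors, but not guaranteed). constexpr needs a C++11 compiler.

```cpp
#include <cstddef>

// Hypothetical helper. Returns true if an object of `size` bytes that starts
// at byte address `addr` touches more than one cache line.
// The default line size of 64 bytes is an assumption, not a guarantee.
constexpr bool straddles_cache_line(std::size_t addr,
                                    std::size_t size,
                                    std::size_t line = 64)
{
    // Compare the line index of the first byte with that of the last byte.
    return size != 0 && (addr / line) != ((addr + size - 1) / line);
}

// A 4-byte value at address 62 covers bytes 62..65 and crosses the boundary at 64.
static_assert(straddles_cache_line(62, 4), "crosses a line boundary");
// A 4-byte value at address 3 is misaligned but stays inside one line.
static_assert(!straddles_cache_line(3, 4), "misaligned but within one line");
```

One consequence: in an array of 8-byte packed structs that starts on a line boundary, a 4-byte member at offset 3 occupies bytes 3 to 6 of each element and never crosses a 64-byte line, so this particular cost would not appear at all.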

There are some instructions which require data to be aligned. If you need these instructions, you will either know about it or your compiler should ensure the alignment for you. Whether this affects performance depends on your processor's microarchitecture and the compiler. In any case, you should start by profiling your program to find the bottleneck. If alignment significantly affects the performance of your program then fix it. Otherwise don't worry about it.

Upvotes: 1

Neil

Reputation: 55402

All your code does is to test to see how quickly the processor can copy memory. The more memory, the slower the copy. The alignment of the individual members within the structure is irrelevant to the speed of the copy, only the size of the structure matters.

If you want to see the effect of alignment, you need to write code that actually accesses individual unaligned structure members. For instance, you could write a loop that increments the Data3 member of each structure. Depending on the architecture, the compiler may realise that it has to use different instructions to perform the arithmetic; on x86 this is usually not the case, and the compiler will emit natural-looking code because the processor is capable of dealing with unaligned accesses.

Some processors can actually read and write unaligned data at the same speed as aligned data. A trivial example of this is the 8088: it only has an 8-bit data bus, so every 16-bit access is performed as two 8-bit transfers anyway. The latest processors, by contrast, spend most of their time reading from cache lines, so the only time unaligned data might make a difference is when the data crosses a cache line.
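A minimal sketch of the Data3-increment loop suggested above, timed with std::chrono rather than Boost. The names PackedData, AlignedData and time_increment are hypothetical, and the code needs a C++11 compiler (so Visual C++ 2012 rather than 2005). The two structs also differ in size, so any timing difference still mixes memory traffic with alignment; padding the packed struct to 12 bytes would be one way to separate the two.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

#pragma pack(push, 1)
struct PackedData  { char Data1; short Data2; int Data3; char Data4; };  // Data3 at offset 3
#pragma pack(pop)
struct AlignedData { char Data1; short Data2; int Data3; char Data4; };  // Data3 at offset 4

// Hypothetical helper: increments Data3 in every element `passes` times
// and returns the elapsed wall-clock time in seconds.
template <typename T>
double time_increment(std::vector<T>& v, int passes)
{
    assert(passes >= 0);
    const auto start = std::chrono::steady_clock::now();
    for (int p = 0; p < passes; ++p)
        for (std::size_t i = 0; i < v.size(); ++i)
            v[i].Data3 += 1;   // the only member access under test
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling time_increment on a std::vector<PackedData> and on a std::vector<AlignedData> of the same length, with the same number of passes, gives two figures to compare. On x86 the difference may well be too small to measure, and an optimising compiler may vectorise or otherwise transform the loop.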

If you want to induce a crash by misalignment, you normally need to cast pointers between different types. The compiler may then not realise that your pointer could be misaligned and will not generate the correct instructions for a misaligned access. For instance, you could attempt to invoke an SSE instruction on a cast char* pointer.
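As an illustration of the SSE case, here is a sketch using the SSE2 intrinsics from <emmintrin.h>. The helpers is_aligned and sum16 are hypothetical names. _mm_load_si128 requires a 16-byte-aligned address and typically faults when given a misaligned one, while _mm_loadu_si128 accepts any address. The sketch picks the safe load, and the comment marks the line that would be expected to crash.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <emmintrin.h>   // SSE2 intrinsics

// Hypothetical helper: true if p is a multiple of `alignment` (a power of two).
inline bool is_aligned(const void* p, std::size_t alignment)
{
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Hypothetical helper: sums the sixteen bytes starting at p.
inline int sum16(const unsigned char* p)
{
    assert(p != nullptr);
    const __m128i* q = reinterpret_cast<const __m128i*>(p);

    // Calling _mm_load_si128(q) unconditionally is the crash scenario:
    // with a misaligned p it is expected to raise a fault.
    const __m128i v = is_aligned(p, 16) ? _mm_load_si128(q)    // aligned load
                                        : _mm_loadu_si128(q);  // unaligned load

    // _mm_sad_epu8 against zero adds up each group of eight bytes,
    // leaving one partial sum in each 64-bit half of the register.
    const __m128i sums = _mm_sad_epu8(v, _mm_setzero_si128());
    return _mm_cvtsi128_si32(sums) +
           _mm_cvtsi128_si32(_mm_srli_si128(sums, 8));
}
```

To see the fault rather than avoid it, replace the conditional with a plain _mm_load_si128(q) and pass a pointer such as buffer + 1, where buffer is a 16-byte-aligned array. Whether this actually crashes depends on the instructions the compiler chooses to emit.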

Upvotes: 4
