Reputation: 39
I am working on face detection: I take a .bmp file as input, detect the face, and draw a rectangle around it.
But after adding a function called "cvDetect" to detect the face, I get a segmentation fault at the following line of code:
_mm_store_ps(&c(y, 4.0*x), _mm_sub_ps(_mm_load_ps(a.data(y, 4.0*x)), _mm_load_ps(b.data(y, 4.0*x))));
While debugging I found that these load/store functions are hitting a memory alignment problem. Can anyone help me solve it? The code is in C++ and I am using Linux.
Upvotes: 1
Views: 4705
Reputation: 15725
It looks like a.data(r, c) is a call on an object called a (either a member function named data, or an operator() on a member named data) which returns a pointer to some memory.
This memory should have been allocated with an aligned allocator such as _mm_malloc, or _aligned_malloc if you're using Visual Studio, and not with plain new or malloc.
If the memory is not dynamically allocated, but is a field in some object, the field should be declared with an alignment attribute like the one specified in asveikau's reply.
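As a rough illustration of the aligned-allocation advice, here is a minimal sketch. The helper names (is_aligned16, alloc_floats16, sub_aligned) are hypothetical and not part of the asker's code; it also assumes the data is a flat float buffer whose length is a multiple of 4, which may not match the real image classes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <xmmintrin.h>  // SSE intrinsics; also provides _mm_malloc/_mm_free on GCC and Clang

// Hypothetical helper: true if p sits on a 16-byte boundary.
inline bool is_aligned16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// Hypothetical helper: allocate n floats on a 16-byte boundary.
// Memory obtained from _mm_malloc must be released with _mm_free,
// not with free or delete.
inline float* alloc_floats16(std::size_t n) {
    return static_cast<float*>(_mm_malloc(n * sizeof(float), 16));
}

// c[i] = a[i] - b[i] for n floats.
// Assumes n is a multiple of 4 and all three pointers are 16-byte aligned;
// the asserts catch a misaligned pointer in debug builds before the
// aligned load/store can fault.
inline void sub_aligned(const float* a, const float* b, float* c, std::size_t n) {
    assert(n % 4 == 0);
    assert(is_aligned16(a) && is_aligned16(b) && is_aligned16(c));
    for (std::size_t i = 0; i < n; i += 4) {
        _mm_store_ps(c + i, _mm_sub_ps(_mm_load_ps(a + i), _mm_load_ps(b + i)));
    }
}
```

Buffers created with alloc_floats16 can be passed straight to sub_aligned; each one should be released with _mm_free when no longer needed.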
Upvotes: 0
Reputation: 212979
A quick fix for now would be to use unaligned loads and stores, i.e.
_mm_storeu_ps(&c(y, 4.0*x), _mm_sub_ps(_mm_loadu_ps(a.data(y, 4.0*x)), _mm_loadu_ps(b.data(y, 4.0*x))));
There will be a performance hit, unless you are using a CPU such as Core i5/i7 where unaligned loads and stores are cheap, but at least it will work correctly.
Ultimately though you need to look at ensuring that your data is always 16 byte aligned.
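One thing worth checking when making the data "always 16 byte aligned": an aligned base pointer is not enough if the row length is not a multiple of 4 floats, because every row after the first then starts off the boundary. The sketch below assumes a row-major float image addressed as base + y * stride + x; that layout and the helper names are assumptions for illustration, not something taken from the question's classes.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: round a row width (in floats) up to a multiple of 4,
// so that each row occupies a whole number of 16-byte blocks.
inline std::size_t padded_stride_floats(std::size_t width) {
    return (width + 3) / 4 * 4;
}

// Hypothetical helper: true if element (y, x) of a row-major float image
// with the given stride (in floats) lies on a 16-byte boundary.
inline bool element_is_aligned16(const float* base, std::size_t stride,
                                 std::size_t y, std::size_t x) {
    const float* p = base + y * stride + x;
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}
```

With a padded stride and an aligned base, every address of the form (y, 4*x) is safe for _mm_load_ps and _mm_store_ps; with an unpadded stride such as 5 floats, the second row already starts 4 bytes past a boundary.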
Upvotes: 1
Reputation: 40246
I don't really know anything about these SSE extensions, but it sounds like you are having trouble aligning your variables. Declaring a particular alignment on a variable has traditionally required non-portable extensions that vary by compiler.
For GCC you'd declare your variable something like this:
// Declare a variable called 'a' of type __m128, aligned at 16 bytes.
__m128 a __attribute__((aligned (16)));
For Microsoft Visual C++ you'd do something like this:
__declspec(align(16)) __m128 a;
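If a C++11 compiler is available, the standard alignas specifier is a portable alternative to both compiler-specific forms. A minimal sketch, assuming a small struct that holds four floats (the names Pixel4 and subtract are made up for this example):

```cpp
#include <xmmintrin.h>  // _mm_load_ps, _mm_store_ps, _mm_sub_ps

// Hypothetical type: four floats whose storage is forced onto a
// 16-byte boundary by the standard C++11 alignas specifier.
struct Pixel4 {
    alignas(16) float v[4];
};

// Element-wise a - b using aligned loads and stores. This is safe here
// because alignas(16) on the member makes every Pixel4 16-byte aligned.
inline Pixel4 subtract(const Pixel4& a, const Pixel4& b) {
    Pixel4 out;
    _mm_store_ps(out.v, _mm_sub_ps(_mm_load_ps(a.v), _mm_load_ps(b.v)));
    return out;
}
```

The guarantee covers objects with automatic or static storage and members of other objects. For over-aligned types created with new, the language only guarantees the requested alignment from C++17 onward, so heap buffers on older compilers are better served by an aligned allocator.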
Upvotes: 2