kamakshi

Reputation: 39

Segmentation fault due to memory alignment in SSE

I am working on face detection: I take a .bmp file as input, detect the face, and draw a rectangle around it.

But when I add a function called "cvDetect" to detect the face, I get a segmentation fault on the following line of code:

_mm_store_ps(&c(y, 4.0*x), _mm_sub_ps(_mm_load_ps(a.data(y, 4.0*x)), _mm_load_ps(b.data(y, 4.0*x))));

While debugging I found that these functions have a memory alignment problem. Can anyone help me solve it? The code is in C++ and I am using Linux.

Upvotes: 1

Views: 4705

Answers (4)

user1071136

Reputation: 15725

It looks like a.data(r, c) is a call to a member function (or to operator() on a member called data) of an object called a, which returns a pointer to some memory.

This memory should have been allocated with _mm_malloc, or with _aligned_malloc if you're using Visual Studio, rather than with new or malloc.

If the memory is not dynamically allocated, but is a field in some object, the field should be declared with an alignment attribute like the one specified in asveikau's reply.
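As a rough illustration of the allocation advice, here is a minimal sketch that allocates three buffers with _mm_malloc and runs the same aligned load/subtract/store pattern as in the question. The helper names (is_aligned16, sub_aligned, demo_aligned_alloc) and the buffer size are made up for the example, and it assumes the element count is a multiple of 4. It is not the asker's actual matrix class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <xmmintrin.h>  // SSE intrinsics; also provides _mm_malloc / _mm_free

// Hypothetical helper: true if p sits on a 16-byte boundary.
inline bool is_aligned16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// Hypothetical helper: c[i] = a[i] - b[i] for n floats (n a multiple of 4).
// All three pointers must be 16-byte aligned for _mm_load_ps/_mm_store_ps.
inline void sub_aligned(float* c, const float* a, const float* b, std::size_t n) {
    assert(is_aligned16(a) && is_aligned16(b) && is_aligned16(c));
    for (std::size_t i = 0; i < n; i += 4) {
        _mm_store_ps(c + i, _mm_sub_ps(_mm_load_ps(a + i), _mm_load_ps(b + i)));
    }
}

// Allocates aligned buffers, fills them, subtracts, and returns c[0].
inline float demo_aligned_alloc() {
    const std::size_t n = 8;
    float* a = static_cast<float*>(_mm_malloc(n * sizeof(float), 16));
    float* b = static_cast<float*>(_mm_malloc(n * sizeof(float), 16));
    float* c = static_cast<float*>(_mm_malloc(n * sizeof(float), 16));
    float result = 0.0f;
    if (a && b && c) {
        for (std::size_t i = 0; i < n; ++i) {
            a[i] = 5.0f;
            b[i] = 2.0f;
        }
        sub_aligned(c, a, b, n);
        result = c[0];
    }
    // Memory from _mm_malloc must be released with _mm_free, not free/delete.
    _mm_free(a);
    _mm_free(b);
    _mm_free(c);
    return result;
}
```

The assert in sub_aligned is there so that a misaligned pointer fails with a readable message in a debug build instead of a bare segmentation fault.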

Upvotes: 0

Paul R

Reputation: 212979

A quick fix for now would be to use unaligned loads and stores, i.e.

_mm_storeu_ps(&c(y, 4.0*x), _mm_sub_ps(_mm_loadu_ps(a.data(y, 4.0*x)), _mm_loadu_ps(b.data(y, 4.0*x))));

There will be a performance hit unless you are using a Core i5/i7, but at least it will work correctly.

Ultimately, though, you need to ensure that your data is always 16-byte aligned.
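One way to combine the two suggestions is to check the pointers at run time and fall back to the unaligned intrinsics only when needed. The sketch below is an assumption-laden example, not the code from the question: aligned16 and sub4 are hypothetical names, and it works on plain float pointers rather than on the a, b and c objects.

```cpp
#include <cassert>
#include <cstdint>
#include <xmmintrin.h>  // SSE intrinsics

// Hypothetical helper: true if p sits on a 16-byte boundary.
inline bool aligned16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// Hypothetical helper: c[0..3] = a[0..3] - b[0..3].
// Uses the aligned intrinsics when all three pointers allow it,
// and the unaligned ones otherwise, so it never faults on alignment.
inline void sub4(float* c, const float* a, const float* b) {
    if (aligned16(a) && aligned16(b) && aligned16(c)) {
        _mm_store_ps(c, _mm_sub_ps(_mm_load_ps(a), _mm_load_ps(b)));
    } else {
        _mm_storeu_ps(c, _mm_sub_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
    }
}
```

Printing the result of a check like aligned16 for each row can also help locate where alignment is lost. With image data, one possible cause (a guess, not a diagnosis of the code above) is a row length in bytes that is not a multiple of 16, which leaves later rows misaligned even when the buffer itself starts on a 16-byte boundary.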

Upvotes: 1

asveikau

Reputation: 40246

I don't really know anything about these SSE extensions, but it sounds like you are having trouble aligning your variables. Declaring a particular alignment for a variable has traditionally required non-portable extensions, which vary by compiler.

For GCC you'd declare your variable something like this:

// Declare a variable called 'a' of type __m128, aligned at 16 bytes.
__m128 a __attribute__((aligned (16)));

For Microsoft Visual C++ you'd do something like this:

__declspec(align(16)) __m128 a;
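With a C++11 or later compiler, the standard alignas specifier is an alternative to the compiler-specific forms above. The sketch below is only an illustration: Pixel4 and demo_alignas are hypothetical names, and the values are arbitrary.

```cpp
#include <cassert>
#include <xmmintrin.h>  // SSE intrinsics

// Hypothetical struct: the member v is guaranteed to start on a
// 16-byte boundary, which also raises the alignment of Pixel4 itself.
struct Pixel4 {
    alignas(16) float v[4];
};

static_assert(alignof(Pixel4) == 16, "Pixel4 should be 16-byte aligned");

// Subtracts two aligned local arrays into an aligned struct member
// and returns the first lane of the result.
inline float demo_alignas() {
    alignas(16) float a[4] = {4.0f, 3.0f, 2.0f, 1.0f};
    alignas(16) float b[4] = {1.0f, 1.0f, 1.0f, 1.0f};
    Pixel4 out;
    _mm_store_ps(out.v, _mm_sub_ps(_mm_load_ps(a), _mm_load_ps(b)));
    return out.v[0];
}
```

Note that alignas covers variables and members with automatic or static storage. Before C++17, heap objects created with plain new are not guaranteed to honor an alignment larger than the default, so dynamically allocated buffers still need an aligned allocator.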

Upvotes: 2

Anycorn

Reputation: 51475

The aligned _ps load and store functions (_mm_load_ps, _mm_store_ps) require 16-byte aligned memory operands.

Upvotes: 0
