Reputation: 567
I'm trying to move an aligned float array into an XMM register:
#define ALIGNED16 __declspec(align(16))
ALIGNED16 float vector1[4] = { 1.0f, 2.0f, 3.0f, 4.0f };
ALIGNED16 float vector2[4] = { 1.0f, 2.0f, 3.0f, 4.0f };
ALIGNED16 float result[4];
_add_vector(vector1, vector2, result);
....
_add_vector(float *__restrict v1, float * __restrict v2, float * __restrict rvec)
{
__asm
{
movaps xmm0, xmmword ptr [v1]
movaps xmm1, xmmword ptr [v2]
addps xmm0, xmm1
movaps xmmword ptr [rvec], xmm0
};
}
When the code tries to load from v1 into xmm0, I get a "read access violation": v1 was 0xFFFFFFFF.
But if I do
__asm
{
movaps xmm0, xmmword ptr [v1]
};
right after the vector1 declaration, it works. Why?
Upvotes: 1
Views: 765
Reputation: 47573
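The issue is that v1, v2, and rvec are pointers to arrays of floats. In the inline assembly, [v1] names the memory where the pointer itself is stored, not the array it points to, so movaps tries to load 16 bytes from the pointer variable's location. You need to load each pointer into a register first and then dereference that register to reach the actual array. Something like this may work: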
void _add_vector(float *__restrict v1, float * __restrict v2, float * __restrict rvec);
void _add_vector(float *__restrict v1, float * __restrict v2, float * __restrict rvec)
{
__asm
{
mov ecx, [v1]
mov edx, [v2]
mov eax, [rvec]
movaps xmm0, xmmword ptr [ecx]
movaps xmm1, xmmword ptr [edx]
addps xmm0, xmm1
movaps xmmword ptr [eax], xmm0
};
}
In this case I use the caller-saved registers EAX, ECX, and EDX to hold the pointer values before dereferencing them.
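As an alternative to inline assembly, the same load, add, and store sequence can be sketched with SSE intrinsics, which let the compiler handle the pointer dereferencing and register allocation. This is only a sketch, not the method from the question: the function name add_vector_intrin is made up for illustration, and it assumes all three pointers are 16-byte aligned, just as movaps requires.

```c
#include <xmmintrin.h>  /* SSE intrinsics: _mm_load_ps, _mm_add_ps, _mm_store_ps */

/* Hypothetical helper (name not from the original post).
   Assumption: v1, v2 and rvec all point to 16-byte aligned storage,
   because _mm_load_ps / _mm_store_ps map to aligned moves (movaps). */
void add_vector_intrin(const float *v1, const float *v2, float *rvec)
{
    __m128 a = _mm_load_ps(v1);            /* load 4 floats from *v1 */
    __m128 b = _mm_load_ps(v2);            /* load 4 floats from *v2 */
    _mm_store_ps(rvec, _mm_add_ps(a, b));  /* rvec[i] = v1[i] + v2[i] */
}
```

It would be called the same way as in the question, for example add_vector_intrin(vector1, vector2, result) with the ALIGNED16 arrays declared there.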
Upvotes: 2