Alatriste

Reputation: 567

function parameter to xmm0

I'm trying to move an aligned float array into an XMM register:

#define ALIGNED16 __declspec(align(16))

ALIGNED16 float vector1[4] = { 1.0f, 2.0f, 3.0f, 4.0f };
ALIGNED16 float vector2[4] = { 1.0f, 2.0f, 3.0f, 4.0f };
ALIGNED16 float result[4];

_add_vector(vector1, vector2, result);
....

_add_vector(float *__restrict v1, float * __restrict v2, float * __restrict rvec)
{
  __asm
  {
    movaps xmm0, xmmword ptr [v1]
    movaps xmm1, xmmword ptr [v2]

    addps xmm0, xmm1

    movaps xmmword ptr [rvec], xmm0
  };
}

When the movaps from v1 into xmm0 executes, I get a "read access violation": v1 was 0xFFFFFFFF.

But if I do

__asm
  {
    movaps xmm0, xmmword ptr [v1]
  };

right after the vector1 declaration, then it works. Why?

Upvotes: 1

Views: 765

Answers (1)

Michael Petch

Reputation: 47573

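The issue is that v1, v2, and rvec are pointers to arrays of floats. Inside an MSVC __asm block, [v1] refers to the memory where the pointer variable itself is stored, so movaps tries to load 16 bytes from the pointer's own location rather than from the array it points to. That is presumably also why the instruction works when it is applied to the array itself (vector1) in the caller: there the name refers to the array's own storage. You need to load each pointer into a register and dereference it to get the actual arrays. Something like this may work: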

void _add_vector(float *__restrict v1, float * __restrict v2, float * __restrict rvec);

void _add_vector(float *__restrict v1, float * __restrict v2, float * __restrict rvec)
{
    __asm
    {
        mov ecx, [v1]
        mov edx, [v2]
        mov eax, [rvec]

        movaps xmm0, xmmword ptr [ecx]
        movaps xmm1, xmmword ptr [edx]

        addps xmm0, xmm1

        movaps xmmword ptr [eax], xmm0
    };
}

In this case I use the caller-saved registers EAX, ECX, and EDX to dereference the pointers.
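
If inline assembly is not a hard requirement, the same load, add, and store can be written with SSE intrinsics, which sidesteps the pointer-versus-pointee confusion because the compiler does the dereferencing. This is only a sketch of an alternative technique, not the code from the question: the names add_vector_sse and is_aligned16 are made up for illustration, and the assert is an assumed debugging aid for catching misaligned pointers before the aligned load faults.

```c
#include <assert.h>     /* assert */
#include <stdint.h>     /* uintptr_t */
#include <xmmintrin.h>  /* SSE: __m128, _mm_load_ps, _mm_add_ps, _mm_store_ps */

/* Hypothetical helper (not from the original post):
   returns nonzero if p sits on a 16-byte boundary. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical intrinsics counterpart of _add_vector.
   All three pointers must be 16-byte aligned, just as movaps requires. */
static void add_vector_sse(const float *v1, const float *v2, float *rvec)
{
    assert(is_aligned16(v1) && is_aligned16(v2) && is_aligned16(rvec));

    __m128 a = _mm_load_ps(v1);            /* aligned load of the 4 floats v1 points to */
    __m128 b = _mm_load_ps(v2);            /* aligned load of the 4 floats v2 points to */
    _mm_store_ps(rvec, _mm_add_ps(a, b));  /* element-wise add, then aligned store */
}
```

It would be called the same way as the original, e.g. add_vector_sse(vector1, vector2, result) with the ALIGNED16 arrays from the question. One practical reason to prefer this form is that MSVC does not accept __asm blocks when targeting x64, while the intrinsics build for both x86 and x64.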

Upvotes: 2
