Kris R

Reputation: 13

SSE segfault on _mm_store_si128

I get a segfault when I try to store some ciphertext that I generated using intrinsics. I do not understand the error at all. Code sample:

unsigned char c[177]; 
unsigned char m[161];
auth = _mm_setzero_si128();
unsigned char M_star[BLOCKSIZE];
__m128i tag = auth; 
for(i=0;i<numblocks_mes;++i)
{
    M = _mm_load_si128(m+i*BLOCKSIZE);
    idx = _mm_set_epi64x(zero,i); // both inputs are unsigned long long
    tmp = encrypt_block(M,idx);
    tag = _mm_xor_si128(tag,tmp);
}
if(fin_mes) 
{
    memcpy(M_star,m+numblocks_mes*BLOCKSIZE,fin_mes);
    M_star[fin_mes] = 0x80;
    memset(M_star+fin_mes+1,0,BLOCKSIZE-(fin_mes+1));
    M = _mm_load_si128(M_star);
    idx = _mm_set_epi64x(tag_fin,numblocks_mes); // both inputs are unsigned long long
    tmp = encrypt_block(M,idx); // Contains calls to AES
    tag = _mm_xor_si128(tag,tmp);
}
// print128_asint(tag);
tag = encrypt_block(tag,nonce);
// Following line causes segfault
_mm_store_si128( (__m128i *)&c[numblocks_mes*BLOCKSIZE+fin_mes], tag ); // numblocks_mes*BLOCKSIZE+fin_mes = 161

I have looked through similar questions and tried their suggestions, but nothing has worked for me.

Upvotes: 1

Views: 520

Answers (1)

Paul R

Reputation: 213180

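The destination address must be 16-byte aligned. Because c[] has no particular alignment, there is no guarantee about the alignment of addresses at arbitrary offsets within c either (even when those offsets are multiples of 16). Here the offset is 161, which is not a multiple of 16, so the store would be misaligned even if c itself were 16-byte aligned.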
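If you want to confirm this for a given pointer, a small check like the following can help. This is only a sketch: is_aligned_16 is a hypothetical helper name, not part of any intrinsics header.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if p is a multiple of 16 bytes,
   which is what _mm_load_si128 / _mm_store_si128 require; 0 otherwise. */
static int is_aligned_16(const void *p)
{
    return ((uintptr_t)p % 16) == 0;
}
```

For a buffer that has been explicitly aligned (for example with C11 `_Alignas(16)`), the check passes at offsets 0 and 160 but fails at offset 161. For a plain `unsigned char c[177]` the result at any offset depends on where the compiler happens to place the array.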

Solution: use _mm_storeu_si128 instead of _mm_store_si128.


Note that you also appear to have been lucky with the loads from m and M_star: you should almost certainly change these to use _mm_loadu_si128 as well.
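As a rough sketch of what the unaligned variants look like in practice, assuming SSE2 is available: store_tag and load_block are hypothetical wrapper names introduced for illustration, and only _mm_storeu_si128 and _mm_loadu_si128 are the actual intrinsics.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>

/* Hypothetical wrapper: write a 128-bit value at any byte offset.
   _mm_storeu_si128 has no alignment requirement on the address. */
static void store_tag(unsigned char *dst, size_t offset, __m128i tag)
{
    _mm_storeu_si128((__m128i *)(dst + offset), tag);
}

/* Hypothetical wrapper: read 16 bytes from any byte offset.
   _mm_loadu_si128 likewise accepts an unaligned address. */
static __m128i load_block(const unsigned char *src, size_t offset)
{
    return _mm_loadu_si128((const __m128i *)(src + offset));
}
```

With these, store_tag(c, 161, tag) writes bytes 161 to 176 of c, which exactly fills the 177-byte array from the question, and load_block(m, i * BLOCKSIZE) can replace the aligned loads inside the loop.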

Upvotes: 2
