anatolyg

Reputation: 28269

How to define a 128-bit constant efficiently?

I am working with the SSE2 instruction set in MS Visual Studio, using it to do some calculations on 16-bit data.

Suppose I have 8 values loaded into an SSE register and I want to add a constant (e.g. 42) to all of them. Here is how I would like my code to look:

__m128i values; // 8 values, 16 bits each
const __m128i my_const_42 = ???; // What should I write here?
values = _mm_add_epi16(values, my_const_42); // Add 42 to the 8 values

Now, how can I define the constant? The following two ways work, but one is inefficient and the other is ugly.

  1. my_const_42 = _mm_set_epi16(42, 42, 42, 42, 42, 42, 42, 42) - the compiler generates 8 instructions to "build" the constant
  2. my_const_42 = {42, 0, 42, 0, 42, 0, 42, 0, 42, 0, 42, 0, 42, 0, 42, 0} - it is hard to see what is going on, and changing 42 to e.g. -42 is not trivial

Is there any way to express the 128-bit constant more conveniently?

Upvotes: 5

Views: 2114

Answers (2)

BitBank

Reputation: 8715

Something to note about creating constants in SSE (or NEON): loading data from memory is extremely slow compared to instruction execution. If the constant you need can be created through code, that's the faster choice. Here are some examples of constants created through code:

 xmmTemp = _mm_cmpeq_epi16(xmmA, xmmA); // every 16-bit lane is 0xFFFF
 xmmTemp = _mm_slli_epi16(xmmTemp, 7); // now every lane is 0xFF80 (-128)

 xmmTemp = _mm_cmpeq_epi16(xmmA, xmmA); // every 16-bit lane is 0xFFFF
 xmmTemp = _mm_slli_epi16(xmmTemp, 15); // 0x8000
 xmmTemp = _mm_srli_epi16(xmmTemp, 11); // 0x0010 (positive 16)
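
For anyone who wants to check these bit tricks, here is a minimal sketch in C. The helper names (make_minus_128_epi16, make_16_epi16, lane0_epi16) are made up for illustration, and a zeroed register stands in for xmmA, since comparing any register with itself yields all ones.

```c
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: read lane 0 of a vector as a signed 16-bit value. */
static int16_t lane0_epi16(__m128i v)
{
    return (int16_t)_mm_extract_epi16(v, 0);
}

/* Hypothetical helper: build -128 in every 16-bit lane without a memory load. */
static __m128i make_minus_128_epi16(void)
{
    __m128i t = _mm_setzero_si128();
    t = _mm_cmpeq_epi16(t, t);    /* every lane becomes 0xFFFF */
    return _mm_slli_epi16(t, 7);  /* 0xFFFF << 7 = 0xFF80 (-128) */
}

/* Hypothetical helper: build +16 in every 16-bit lane without a memory load. */
static __m128i make_16_epi16(void)
{
    __m128i t = _mm_setzero_si128();
    t = _mm_cmpeq_epi16(t, t);    /* every lane becomes 0xFFFF */
    t = _mm_slli_epi16(t, 15);    /* 0x8000 */
    return _mm_srli_epi16(t, 11); /* logical shift right: 0x0010 (16) */
}
```

Note that this only works for values reachable with a couple of shifts; something like 42 is not one of them.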

Upvotes: 3

Hans Passant

Reputation: 941665

Ninety percent of the battle is finding the correct intrinsic. The MSDN Library is pretty well organized; start at this page. From there, drill down like this:

  • You know you want to use "MMX, SSE and SSE2 Intrinsics", click that link
  • You know you want to use "Streaming SIMD Extensions 2", click that link
  • Next attractive link is "Integer Memory and Initialization" since you don't want floating point
  • You'll get two relevant links, Load and Set Operations
  • Load just gets you the ones you already found

Set is golden: out pops _mm_set1_epi16(short w), which sets all eight 16-bit lanes to w.
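
Applied to the code in the question, that could look roughly like the sketch below. The wrapper function add_42_epi16 and the unaligned load/store around it are my own assumptions, added only to make the snippet self-contained; the relevant part is the _mm_set1_epi16(42) line.

```c
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: add 42 to each of 8 signed 16-bit values in place. */
static void add_42_epi16(int16_t v[8])
{
    /* Broadcast 42 into all eight 16-bit lanes. */
    const __m128i my_const_42 = _mm_set1_epi16(42);

    __m128i values = _mm_loadu_si128((const __m128i *)v); /* 8 values, 16 bits each */
    values = _mm_add_epi16(values, my_const_42);          /* add 42 to the 8 values */
    _mm_storeu_si128((__m128i *)v, values);
}
```

Changing the constant to -42 is then just _mm_set1_epi16(-42).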

Upvotes: 9
