jaynp

Reputation: 3325

Loading non-contiguous floats using SSE

Is there an Intel SSE instruction that can load floats from non-contiguous, evenly spaced memory addresses?

For example, given an array A = {0, 1, 2, 3, ..., n}, I would like to load {A[0], A[4], A[8], A[12]} into a 128-bit register at once, followed by {A[5], A[9], A[13], A[17]}.

Upvotes: 2

Views: 461

Answers (1)

Paul R

Reputation: 212969

In this kind of use case you would typically load multiple contiguous vectors and then permute them into the required arrangement using, e.g., pshufd or punpckldq (or their floating-point counterparts, shufps and unpcklps).
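As a rough illustration of the load-then-permute approach, here is a minimal sketch for the stride-4 case from the question. The function name `load_stride4` is made up for this example, and it leans on the `_MM_TRANSPOSE4_PS` macro from `xmmintrin.h` to do the shuffling rather than spelling out individual shuffle/unpack instructions. It assumes at least 16 readable floats at `a` and makes no alignment assumption.

```c
#include <xmmintrin.h>  /* SSE intrinsics and _MM_TRANSPOSE4_PS */

/* Hypothetical helper: reads 16 contiguous floats starting at a and
   produces out[k] = { a[k], a[k+4], a[k+8], a[k+12] } for k = 0..3. */
static void load_stride4(const float *a, __m128 out[4])
{
    /* Four ordinary contiguous (unaligned) loads. */
    __m128 r0 = _mm_loadu_ps(a + 0);   /* a[0]  a[1]  a[2]  a[3]  */
    __m128 r1 = _mm_loadu_ps(a + 4);   /* a[4]  a[5]  a[6]  a[7]  */
    __m128 r2 = _mm_loadu_ps(a + 8);   /* a[8]  a[9]  a[10] a[11] */
    __m128 r3 = _mm_loadu_ps(a + 12);  /* a[12] a[13] a[14] a[15] */

    /* In-register 4x4 transpose: rows become columns. */
    _MM_TRANSPOSE4_PS(r0, r1, r2, r3);

    out[0] = r0;  /* a[0] a[4] a[8]  a[12] */
    out[1] = r1;  /* a[1] a[5] a[9]  a[13] */
    out[2] = r2;  /* a[2] a[6] a[10] a[14] */
    out[3] = r3;  /* a[3] a[7] a[11] a[15] */
}
```

The four loads yield all four strided vectors at once, so this tends to pay off when you need every "column" and not just one of them.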

Note that AVX2 (Haswell and later) adds gather load instructions (e.g. vgatherdps, exposed as the _mm_i32gather_ps intrinsic), which might also be worth considering.
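A possible sketch of the gather route follows. The names `gather4` and `gather4_avx2` are hypothetical, and the `target` attribute and `__builtin_cpu_supports` are GCC/Clang extensions used here only so the file builds without `-mavx2` and falls back to scalar loads on CPUs without AVX2; with another compiler you would handle dispatch differently. Whether a gather actually beats load-and-shuffle depends on the microarchitecture, so treat this as something to benchmark, not a recommendation.

```c
#include <immintrin.h>  /* AVX2 intrinsics, including _mm_i32gather_ps */

/* AVX2 path (GCC/Clang-specific attribute enables AVX2 for this function only). */
__attribute__((target("avx2")))
static void gather4_avx2(const float *base, int first, int stride, float out[4])
{
    /* Element indices: first, first+stride, first+2*stride, first+3*stride. */
    __m128i idx = _mm_setr_epi32(first, first + stride,
                                 first + 2 * stride, first + 3 * stride);
    /* Scale of 4 = sizeof(float); it must be a compile-time constant. */
    __m128 v = _mm_i32gather_ps(base, idx, 4);
    _mm_storeu_ps(out, v);
}

/* Hypothetical wrapper: uses the gather when the CPU reports AVX2,
   otherwise performs the same four loads one element at a time. */
static void gather4(const float *base, int first, int stride, float out[4])
{
    if (__builtin_cpu_supports("avx2")) {
        gather4_avx2(base, first, stride, out);
    } else {
        for (int i = 0; i < 4; ++i)
            out[i] = base[first + i * stride];
    }
}
```

With the array from the question, `gather4(A, 5, 4, out)` would fill `out` with A[5], A[9], A[13], A[17], since a gather places no constraint on where the first index starts.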

Upvotes: 3
