Eric Stotch

Reputation: 241

Is there an instruction to shift 128/256 bits by 1?

I think I found a solution by rewriting my code to shift before movemask_epi8, but it doesn't look like I can shift a 128/256-bit value by 1 bit. Is that true? Searching for "sr" and looking at the 128-bit instructions only shows shifts that multiply the count by 8, i.e. byte shifts: https://software.intel.com/sites/landingpage/IntrinsicsGuide/#expand=789,5534,5534&techs=SSE2&cats=Shift&text=sr

What I originally intended was to shift an array right by 1 bit and AND them, but I guess I need to do that before the movemask. I thought it was strange that I can't do a 128/256-bit shift by 1.

Upvotes: 2

Views: 694

Answers (1)

Peter Cordes

Reputation: 365517

vpmovmskb only looks at the top bit of each byte, so if you can handle getting the bits out in the opposite order, you can shift left instead.

For example, use vpaddb to add a register to itself, which shifts every byte left by 1. The element size doesn't actually matter, because it's fine if bits cross byte boundaries as long as they don't reach the MSB of the next byte. So you can use vpslld ymm, ymm, 4 or similar to start a second dependency chain instead of one chain of 7x vpaddb. That also gives you a uop that might be able to run on a different port than vpaddb/w/d, on CPUs where vpadd* can't run on every vector-ALU port like it can on Skylake.
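As a rough illustration of the two-chain idea, here is a minimal sketch using 128-bit SSE2 intrinsics (the same pattern should carry over to the 256-bit AVX2 versions). The function name bit_planes and the output layout (planes[k] holds bit k of every byte) are assumptions made for this example, not something from the question.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: planes[k] gets a 16-bit mask whose bit i is
   bit k of byte i in v (k = 0 is the LSB, k = 7 is the MSB). */
static inline void bit_planes(__m128i v, uint16_t planes[8])
{
    /* Second dependency chain: after a left shift by 4, original bit 3
       of each byte sits in that byte's MSB. Bits that spill over from
       the byte below land in the low nibble and never reach the MSB
       during the three byte-wise doublings that follow. */
    __m128i lo = _mm_slli_epi32(v, 4);

    for (int k = 7; k >= 4; --k) {
        planes[k]     = (uint16_t)_mm_movemask_epi8(v);   /* bits 7..4 */
        planes[k - 4] = (uint16_t)_mm_movemask_epi8(lo);  /* bits 3..0 */
        v  = _mm_add_epi8(v, v);    /* paddb x,x: shift each byte left by 1 */
        lo = _mm_add_epi8(lo, lo);
    }
}
```

The masks come out from the most significant bit downward, which is the "opposite order" mentioned above.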

You're correct that you can't easily shift right by 1 bit across 64-bit element boundaries.

XMM/YMM registers are SIMD vectors, not 128-bit integers. In SSE/AVX/AVX-512, the widest element size for bit-granularity shifts is 64 bits. Beyond that, whole-vector shuffles have byte granularity at the smallest.
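If a true 128-bit right shift by 1 is needed anyway, it can be emulated with a few instructions by combining a 64-bit element shift with a whole-register byte shift that carries the bit across the boundary. This is a sketch under that assumption; shr1_128 is a hypothetical name, and it is not a single instruction.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: logical right shift of the whole 128-bit
   register by 1 bit, built from 64-bit and byte-granularity shifts. */
static inline __m128i shr1_128(__m128i v)
{
    /* Shift each 64-bit half right by 1 (psrlq). This loses the bit
       that should move from the high half into the low half. */
    __m128i halves = _mm_srli_epi64(v, 1);

    /* Bring the high half down into the low half with a byte shift
       (psrldq by 8 bytes), then move its bit 0 up to bit 63. */
    __m128i carry = _mm_slli_epi64(_mm_srli_si128(v, 8), 63);

    return _mm_or_si128(halves, carry);
}
```

For 256-bit vectors the byte shift works within each 128-bit lane, so carrying a bit across the lane boundary would need an extra lane-crossing shuffle.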

Upvotes: 2
