Reputation: 5080
My apologies if this has been asked/answered before but I'm honestly not even sure how to word this as a question properly. I have the following bit pattern:
0110110110110110110110110110110110110110110110110110110110110110
I'm trying to perform a shift that'll preserve my underlying pattern; my first instinct was to use right rotation ((x >> count) | (x << (-count & 63)))
but the asymmetry in my bit pattern results in:
0011011011011011011011011011011011011011011011011011011011011011
<--- wrong
The problem is that the most significant (far left) bit ends up being 0 instead of the desired 1:
1011011011011011011011011011011011011011011011011011011011011011
<--- right
Is there a colloquial name for this function I'm looking for? If not, how could I go about implementing this idea?
Additional Information:
n - 1
ones (where n
is an odd number) and then repeats infinitely.Upvotes: 2
Views: 281
Reputation: 64904
The problem was changed a bit through the comments.
For all reasonable n
, the following problem can be solved efficiently after minimal pre-computation:
Given an offset
k
, get 64 bits starting at that position in the stream of bits that follows the pattern of (zero,n-1
ones) repeating.
Clearly the pattern repeats with a period of n
, so only n
different ulong
s have to be produced for every given value of n
. That could either be done explicitly, constructing all of them in pre-processing (they could be constructed in any obvious way, it doesn't really matter since that only happens once), or left more implicitly by storing only two ulongs per value for n
(this works under the assumption that n < 64
, see below) and then extracting a range from them with some shifting/ORing. Either way, use offset % n
to compute which pattern to retrieve (since the offset is increasing in a predictable manner, no actual modulo operation is required[1]).
Even with the first method, memory consumption will be reasonable since this optimization is only an optimization for low n
: in particular for n > 64
there will be fewer than 1 zero per word on average, so the "old fashioned way" of visiting every multiple of n
and resetting that bit starts to skip work while the above trick would still visit every word and would not be able anymore to reset multiple bits at once.
[1]: if there are multiple n
's in play at the same time, a possible strategy is keeping an array offsets
where offsets[n] = offset % n
, which could be updated according to: (not tested)
int next = offsets[n] + _64modn[n]; // 64 % n precomputed
offsets[n] = next - (((n - next - 1) >> 31) & n);
The idea being that n
is subtracted whenever next >= n
. Only one subtraction is needed since the offset and thing added to the offset are already reduced modulo n
.
This offset-increment can be done with System.Numerics.Vectors, which is very feature-poor compared to actual hardware but is just about able to do this. It can't do the shift (yes, it's weird) but it can implement a comparison in a branchless way.
Doing one pass per value of n
is easier, but touches lots of memory in a cache unfriendly manner. Doing lots of different n
at the same time may not be great either. I guess you'd just have to bechmark that..
Also you could consider hard-coding it for some low numbers, something like offset % 3
is fairly efficient (unlike offset % variable
). This does take manual loop-unrolling which is a bit annoying, but it's actually simpler, just big in terms of lines of code.
Upvotes: 1
Reputation: 283664
You've got a number structured like this:
B16 B15 B14 B13 B12 B11 B10 B09 B08 B07 B06 B05 B04 B03 B02 B01 B00
? 0 1 1 0 1 1 0 1 1 0 1 1 0 1 1 0
The ?
needs to appear in the MSB (B15, or B63, or whatever) after the shift. Where does it come from? Well, the closest copy is found n
places to the right:
B13 0 1 1 0 1 1 0 1 1 0 1 1 0 1 1 0
^--------------/
If your word has width w
, this is 1 << (w-n)
*
So you can do:
var selector = 1 << (w-n);
var rotated = (val >> 1) | ((val & selector) << (n-1));
But you may want a multiple shift. Then we need to build a wider mask:
? 0 1 1 0 1 1 0 1 1 0 1 1 0 1 1 0
* * * * *
Here I've chosen to pretend n = 6
, it just needs to be a multiple of the basic n
, and larger than shift
. Now:
var selector = ((1UL << shift) - 1) << (w - n);
var rotated = (val >> shift) | ((val & selector) << (n - shift));
Working demonstration using your pattern: http://rextester.com/UWYSW47054
It's easy to see that the output has period 3, as required:
1:B6DB6DB6DB6DB6DB 2:DB6DB6DB6DB6DB6D 3:6DB6DB6DB6DB6DB6 4:B6DB6DB6DB6DB6DB 5:DB6DB6DB6DB6DB6D 6:6DB6DB6DB6DB6DB6 7:B6DB6DB6DB6DB6DB 8:DB6DB6DB6DB6DB6D 9:6DB6DB6DB6DB6DB6 10:B6DB6DB6DB6DB6DB 11:DB6DB6DB6DB6DB6D
Upvotes: 2
Reputation: 14379
Branchless Answer after a poke by @BenVoigt:
Original Answer:
Upvotes: 1
Reputation: 112362
Instead of storing a lot of repetitions of a pattern, just store one recurrence and apply modulo operations on the indexes
byte[] pattern = new byte[] { 0, 1, 1 };
// Get a "bit" at index "i", shifted right by "shift"
byte bit = pattern[(i - shift + 1000000 * byte.Length) % byte.Length];
The + 1000000 * byte.Length
must be greater than the greatest expected shift and ensures that we get a posistive sum.
This allows you to store patterns of virtually any length.
An optimization would be to store a mirrored version of the pattern. You could then shift left instead of right. This would simplify the index calculation
byte bit = pattern[(i + shift) % byte.Length];
Upvotes: 1