Reputation: 1312
i have a byte array. Now i need to know the count of appearances of a bit pattern which length is N.
For example, my byte array is "00100100 10010010" and the pattern is "001". here N=3, and the count is 5.
Dealing with bits is always my weak side.
Upvotes: 3
Views: 3260
Reputation: 7985
If N may be arbitrary large You can store the bit pattern in a vector
vector<unsigned char> pattern;
The size of the vector should be
(N + 7) / 8
Store the pattern shifted to the right. By this, I mean, that for example, if N == 19, Your vector should look like:
|<- v[0] ->|<- v[1] ->|<- v[2] ->|
0 0 0 0 0 0 1 1 0 0 0 1 0 0 1 0 0 1 0 0 1 0 0 1
| |<- pattern ->|
If You have Your pattern originally shifted to the left, You can use the function I'll present below, to shift the bits to the right.
Define a vector of bytes, of the same length as the pattern, to store a part of Your bit stream for comparing it with the pattern. I'll call it window
vector<unsigned char> window;
If N is not an integer multiple of 8, You will need to mask some leftmost bits in Your window
, when comparing it with the pattern. You can define the mask this way:
unsigned char mask = (1 << (N % 8)) - 1;
Now, assuming the window
contains the bits, it should, You could theoretically compare the pattern with the window
using vector's operator == like this
window[0] &= mask;
bool isMatch = (window == pattern);
But there are good reasons to be a little bit more sophisticated. If N is large and Your byte array, You look for the pattern in, is significantly larger, it's worth it, to process the pattern and build a vector of size N+1:
vector<int> shifts;
This vector will store the information, how many bits to shift the bit stream by, for the next comparison, based on the position, at which there is a mismatch in the current window
.
Consider the pattern 0001001100
. You should compare the bits with the window
from right to left. If there is a missmatch at the first bit, You know it's 1 and the first occurrence of 1 in Your pattern is at the position 2 counting form 0 form the right to the left. So in that case, You know, that it doesn't make sense to make a comparison if the number of new bits shifted form the bit stream into the window
is less than 2. Similarly if the mismatch occurs at the third bit (position 2 counting form 0), the window
should be moved by 7, because 3 consecutive zeros in your pattern are at the end. If the mismatch is at the position 4, You can move the window
by 8 and so on. The sifts
vector, at an index i
will hold number of bits, by which to move the window
, if the mismatch occurs at the position i
. If there is a match, the window
should be moved by the number of bits stored in shifts[N]
. In the example above, a match means a shift by 8.
In practice of course, You compare whole bytes form the pattern with the bytes from the window
(going form right to left) and if there is a mismatch You examine the bits in the byte to find the mismatch position.
if(window[i] != pattern[i])
{
int j = 0;
unsigned char mismatches = window[i] ^ pattern[i];
while((mismatches & 1) == 0)
{
mismatches >>= 1;
++j;
}
mismatch_position = 8 * (window.size() - i - 1) + j;
}
Here is a function that might come handy, when You need to shift some bits from Your bit stream into the window
. I wrote it in C#, but conversion to C++ should be trivial. C# makes some casts necessary, that are probably not necessary in C++. Use unsigned char
instead of byte
, vector<unsigned char> &
instead of byte []
, size()
instead of Length
and maybe some more minor tweaks. The function is probably a little more general than needed in Your scenario, as it doesn't use the fact, that consecutive calls retrieve consecutive chunks of Your byte array, which maybe could make it a bit simpler, but I don't think it hurts. In the current form, it can retrieve arbitrary bit substring form the byte array.
public static void shiftBitsIntoWindow_MSbFirst(byte[] window, byte[] source,
int startBitPosition, int numberOfBits)
{
int nob = numberOfBits / 8;
// number of full bytes from the source
int ntsh = numberOfBits % 8;
// number of bits, by which to shift the left part of the window,
// in the case, when numberOfBits is not an integer multiple of 8
int nfstbb = (8 - startBitPosition % 8);
// number Of bits from the start to the first byte boundary
// The value is from the range [1, 8], which comes handy,
// when checking if the substring of ntsh first bits
// crosses the byte boundary in the source, by evaluating
// the expression ntsh <= nfstbb.
int nfbbte = (startBitPosition + numberOfBits) % 8;
// number of bits from the last byte boundary to the end
int sbtci;
// index of the first byte in the source, from which to start
// copying nob bytes from the source
// The way in which the (sbtci) index is calculated depends on,
// whether nob < window.Length
if(nob < window.Length)// part of the window will be replaced
// with bits from the source, but some part will remain in the
// window, only moved to the beginning and possibly shifted
{
sbtci = (startBitPosition + ntsh) / 8;
//Loop below moves bits form the end of the window to the front
//making room for new bits that will come form the source
// In the corner case, when the number by which to shift (ntsh)
// is zero the expression (window[i + nob + 1] >> (8 - ntsh)) is
// zero and the loop just moves whole bytes
for(int i = 0; i < window.Length - nob - 1; ++i)
{
window[i] = (byte)((window[i + nob] << ntsh)
| (window[i + nob + 1] >> (8 - ntsh)));
}
// At this point, the left part of the window contains all the
// bytes that could be constructed solely from the bytes
// contained in the right part of the window. Next byte in the
// window may contain bits from up to 3 different bytes. One byte
// form the right edge of the window and one or two bytes form
// the source. If the substring of ntsh first bits crosses the
// byte boundary in the source it's two.
int si = startBitPosition / 8; // index of the byte in the source
// where the bit stream starts
byte byteSecondPart; // Temporary variable to store the bits,
// that come from the source, to combine them later with the bits
// form the right edge of the window
int mask = (1 << ntsh) - 1;
// the mask of the form 0 0 1 1 1 1 1 1
// |<- ntsh ->|
if(ntsh <= nfstbb)// the substring of ntsh first bits
// doesn't cross the byte boundary in the source
{
byteSecondPart = (byte)((source[si] >> (nfstbb - ntsh)) & mask);
}
else// the substring of ntsh first bits crosses the byte boundary
// in the source
{
byteSecondPart = (byte)(((source[si] << (ntsh - nfstbb))
| (source[si + 1] >> (8 - ntsh + nfstbb))) & mask);
}
// The bits that go into one byte, but come form two sources
// -the right edge of the window and the source, are combined below
window[window.Length - nob - 1] = (byte)((window[window.Length - 1] << ntsh)
| byteSecondPart);
// At this point nob whole bytes in the window need to be filled
// with remaining bits form the source. It's done by a common loop
// for both cases (nob < window.Length) and (nob >= window.Length)
}
else// !(nob < window.Length) - all bits of the window will be replaced
// with the bits from the source. In this case, only the appropriate
// variables are set and the copying is done by the loop common for both
// cases
{
sbtci = (startBitPosition + numberOfBits) / 8 - window.Length;
nob = window.Length;
}
if(nfbbte > 0)// The bit substring coppied into one byte in the
// window crosses byte boundary in the source, so it has to be
// combined form the bits, commming form two consecutive bytes
// in the source
{
for(int i = 0; i < nob; ++i)
{
window[window.Length - nob + i] = (byte)((source[sbtci + i] << nfbbte)
| (source[sbtci + 1 + i] >> (8 - nfbbte)));
}
}
else// The bit substring coppied into one byte in the window
// doesn't cross byte boundary in the source, so whole bytes
// are simply coppied
{
for(int i = 0; i < nob; ++i)
{
window[window.Length - nob + i] = source[sbtci + i];
}
}
}
Upvotes: 1
Reputation: 179787
Convert your byte array and pattern each to a std::vector<bool>
, then call std::search(source.begin(), source.end(), pattern.begin(), pattern.end());
. Despite vector<bool>
s idiosyncracies, this will work.
Upvotes: 0
Reputation: 14112
Assuming your array fits into an unsigned int:
int main () {
unsigned int curnum;
unsigned int num = 0x2492;
unsigned int pattern = 0x1;
unsigned int i;
unsigned int mask = 0;
unsigned int n = 3;
unsigned int count = 0;
for (i = 0; i < n; i++) {
mask |= 1 << i;
}
for (i = 8 * sizeof(num) - n; i >= 0; i--) {
curnum = (num >> i) & mask;
if (! (curnum ^ pattern)) {
count++;
}
}
}
Upvotes: 0
Reputation: 6317
You could always XOR the first N bits and if you get 0 as a result you have a match. Then shift the searched bit "stream" one bit to the left and repeat. That is assuming you want to get matches if those sub-patterns overlap. Otherwise you should shift by pattern length on match.
Upvotes: 7