Reputation: 311
Is there a way of iterating over a (possibly huge) std::bitset
that is linear in the number of bits that are set to true? I want to prevent having to check every single position in the bitset. The iteration should successively return the indices of each bit that is set to true.
Upvotes: 31
Views: 13984
Reputation: 76660
There is a way to do this in near O(k) time [plus O(n/64)] using the standard bit-manipulation tools added in C++20 (the <bit> header).
The O(n/64) term is not a problem for me, but it may be an issue if your set is very large and very sparse.
These map to CPU intrinsics on all the major platforms (x64, ARM, etc).
In order for this to work we have to make a few assumptions.
1: The bitset is a standard size matching the native register size. 64 bits is a good starting point. You store the bitsets in an array as shown below.
2: The number of set bits is known, or the length of the bitset is known. You could add length checking to the function, but this code assumes you know the number of set bits in advance (because you are tracking them in some additional list).
#include <bit>      // std::countr_zero
#include <bitset>
#include <cassert>
#include <cstdint>

// Walk through an array of std::bitset<64> and return the index of the
// next set bit (or clear bit, when satbit == false) after `previous`.
// No attempt is made to stay within the bounds of the array,
// so you need to know how many bits are set in total.
template <bool satbit>
int NextSetBit(const std::bitset<64>* bits, const int previous = -1) {
    // Walking through the satisfied bits with countr_zero (tzcnt) is much
    // faster than testing each single bit, especially for bitsets that
    // have many 0 bits.
    assert(nullptr != bits);
    assert(previous >= -1);
    // Start just past the previously returned bit.
    auto next = previous + 1;
    auto chunk = next / 64;  // starting chunk
    const auto firstmask = uint64_t(-1) << (next % 64);  // mask off the previously investigated bits
    const auto getnextchunk = [=](const int chunk) {
        if constexpr (satbit) { return bits[chunk].to_ullong(); }
        else                  { return ~(bits[chunk].to_ullong()); }
    };
    auto data = firstmask & getnextchunk(chunk);
    while (0 == data) {
        data = getnextchunk(++chunk);
    }
    next = std::countr_zero(data);
    assert(bits[chunk].test(next) == satbit);
    return next + (chunk * 64);
}
The code is called as follows:
const auto length = LengthOf(bits);       // however you track the array length
auto next = -1;
for (auto i = 0; i < SetBitCount; i++) {  // SetBitCount: known number of set bits
    next = NextSetBit<true>(bits, next);
    assert(next < length);
    doStuff(next);
}
The n/64 overhead factor is only a problem if your set is very sparse. You can fix this by excluding empty chunks from the iteration, using a list that tracks which chunks have set bits in them. Such bookkeeping can easily be made O(1) per update.
You are thus able to skip empty chunks entirely, making the code truly O(k) at the cost of some constant overhead.
Because such bookkeeping needs to track the number of set bits per chunk anyway, you can use the code as is without having to add bounds checking.
Upvotes: 1
Reputation: 372972
A standard bitvector does not support efficient iteration over the true bits: the runtime is always O(n), where n is the total number of bits, with no dependence on k. However, there are specialized data structures, such as van Emde Boas trees and y-fast tries, that support iteration over the bits in O(k lg lg n) time, where n is the number of bits and k is the number of true bits.
Upvotes: 19
Reputation: 5937
Sometimes people use run-length encoding for things like that. If you encode the incoming bitset into an array of run lengths, the number of runs you end up with won't exceed the number of transitions between set and clear bits, which is at most 2*k. Furthermore, in many applications the number of transitions is much less than k, so you'd get excellent average-case performance in addition to the linear worst case.
Furthermore, it's straightforward to add a data structure that lets you efficiently answer queries such as "the next set bit starting from the n-th position in the array": just build a prefix-sum scan of the run lengths.
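As a sketch of the encoding step (names and the alternating-runs convention are my own, assuming the first run counts clear bits):

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

// Encode a bitset as alternating run lengths, starting with a
// (possibly zero-length) run of clear bits at index 0.
template <size_t N>
std::vector<size_t> runLengths(const std::bitset<N>& b) {
    std::vector<size_t> runs;
    bool current = false;  // the first run counts clear bits
    size_t len = 0;
    for (size_t i = 0; i < N; ++i) {
        if (b[i] == current) {
            ++len;
        } else {
            runs.push_back(len);
            current = !current;
            len = 1;
        }
    }
    runs.push_back(len);
    return runs;
}

// Recover the set-bit indices in time linear in the number of runs plus k.
std::vector<size_t> setIndices(const std::vector<size_t>& runs) {
    std::vector<size_t> out;
    size_t pos = 0;
    for (size_t r = 0; r < runs.size(); ++r) {
        if (r % 2 == 1)  // odd-numbered runs are runs of set bits
            for (size_t i = 0; i < runs[r]; ++i) out.push_back(pos + i);
        pos += runs[r];
    }
    return out;
}
```

The prefix-sum scan mentioned above would simply be the running total of `pos` per run, which lets you binary-search for the run containing any position.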
Upvotes: 0
Reputation: 130
There are only two options that do much better than O(N) on total bits:
Upvotes: 1
Reputation: 12515
You can check up to 32 bits at a time with a u64 accumulator and a 32-entry table of prefix masks like
u32 kTable[] =
{
    0x01, 0x03, 0x07, 0x0F, ..., 0xFFFFFFFF
};
Just read 32 bits into a u64 accumulator, shift it down depending on the offset, and check your bits against the table. You can do this in a binary fashion to bring the number of compares down to at most 5. This will be slower for data that isn't 'linear' in fashion. This then becomes log time.
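A minimal sketch of the binary-search idea (my own reconstruction; the masks are the prefix masks from the table above, generated inline rather than stored):

```cpp
#include <cassert>
#include <cstdint>

// Find the position of the lowest set bit in a 32-bit word by binary
// search over prefix masks: at most 5 compares, matching the log-time
// claim above. Returns -1 for a zero word.
int lowestSetBit(uint32_t x) {
    if (x == 0) return -1;
    int pos = 0;
    for (int width = 16; width > 0; width >>= 1) {
        uint32_t mask = (uint32_t(1) << width) - 1;  // 0xFFFF, 0xFF, 0xF, 0x3, 0x1
        if ((x & mask) == 0) {  // no set bit in the low half: skip past it
            x >>= width;
            pos += width;
        }
    }
    return pos;
}
```

On hardware with a tzcnt/ctz instruction, std::countr_zero does the same job in one instruction; this search is mainly of interest where no such intrinsic is available.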
Upvotes: 1
Reputation: 40877
Looping over the entire bitset, checking each value, and storing the index if it is true IS linear (in the total number of bits). You can speed that up with a lookup table, though.
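One common form of that lookup table (a sketch with my own names): precompute, for each of the 256 byte values, the positions of its set bits, then scan the bitset a byte at a time.

```cpp
#include <array>
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

// For each possible byte value, precompute which bit positions are set.
std::array<std::vector<int>, 256> makeByteTable() {
    std::array<std::vector<int>, 256> table;
    for (int v = 0; v < 256; ++v)
        for (int b = 0; b < 8; ++b)
            if (v & (1 << b)) table[v].push_back(b);
    return table;
}

// Collect set-bit indices byte by byte. Still O(n) in total bits,
// but zero bytes are dismissed with a single table lookup.
template <size_t N>
std::vector<size_t> setBits(const std::bitset<N>& bits) {
    static const auto table = makeByteTable();
    std::vector<size_t> out;
    for (size_t byte = 0; byte * 8 < N; ++byte) {
        unsigned v = 0;
        for (size_t b = 0; b < 8 && byte * 8 + b < N; ++b)
            v |= unsigned(bits[byte * 8 + b]) << b;
        for (int p : table[v]) out.push_back(byte * 8 + p);
    }
    return out;
}
```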
Upvotes: -1
Reputation: 106196
For that to be linear, you'd need a linked list/array/set of the indices that are true. Keeping such a secondary index is not part of the performance/storage trade-offs required by std::bitset, and given that it would disadvantage everyone without your specific requirement, there's no way an implementation will provide it. You could consider supplementing your bitset with such a container yourself, or using Boost's Multi-index Containers library.
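A minimal sketch of such a supplement (names are mine): pair the bitset with a std::set of true indices, so updates cost O(log k) and iterating the true bits costs O(k).

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Sketch: a bitset plus an ordered set of its true indices.
// set/reset pay O(log k); iterating the true bits is O(k).
template <size_t N>
struct IndexedBitset {
    std::bitset<N> bits;
    std::set<size_t> trueIndices;

    void set(size_t i)   { bits.set(i); trueIndices.insert(i); }
    void reset(size_t i) { bits.reset(i); trueIndices.erase(i); }
    bool test(size_t i) const { return bits.test(i); }

    // The true bits, in increasing index order.
    const std::set<size_t>& setBits() const { return trueIndices; }
};
```

The extra container roughly doubles the per-update cost and adds memory overhead, which is exactly the trade-off the answer says std::bitset refuses to impose on everyone.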
Upvotes: 5