Reputation: 411
When working with Project Euler problems I often need large (> 10**7) bit array's.
My normal approach is one of:
bool* sieve = new bool[N];
bool sieve[N];
When N = 1,000,000 my program uses 1 MegaByte (8 * 1,000,000 bits).
Is there a more efficient way to use store bit arrays than bool in c++?
Upvotes: 26
Views: 77970
Reputation: 31
A 'bool' type isn't stored using only 1 bit. From your comment about the size, it seems to use 1 entire byte for each bool.
A C like way of doing this would be:
uint8_t sieve[N/8]; //array of N/8 bytes
element of array is:
result = sieve[index / 8] || (1 << (index % 8));
or
result = sieve[index >> 3] || (1 << (index & 7));
set 1 in array:
sieve[index >> 3] |= 1 << (index & 7);
Upvotes: 3
Reputation: 347
You might be interested in trying the BITSCAN library as an alternative. Recently an extension has been proposed for sparseness, which I am not sure is your case, but might be.
Upvotes: 2
Reputation: 92854
Use std::bitset
(if N
is a constant) otherwise use std::vector<bool>
as others have mentioned (but dont forget reading this excellent article by Herb Sutter)
A bitset is a special container class that is designed to store bits (elements with only two possible values: 0 or 1, true or false, ...).
The class is very similar to a regular array, but optimizing for space allocation: each element occupies only one bit (which is eight times less than the smallest elemental type in C++: char).
EDIT:
Herb Sutter (in that article) mentions that
The reason std::vector< bool > is nonconforming is that it pulls tricks under the covers in an attempt to optimize for space: Instead of storing a full char or int for every bool[1] (taking up at least 8 times the space, on platforms with 8-bit chars), it packs the bools and stores them as individual bits(inside, say, chars) in its internal representation.
std::vector < bool > forces a specific optimization on all users by enshrining it in the standard. That's not a good idea; different users have different requirements, and now all users of vector must pay the performance penalty even if they don't want or need the space savings.
EDIT 2:
And if you have used Boost you can use boost::dynamic_bitset
(if N
is known at runtime)
Upvotes: 34
Reputation: 490048
You could look up std::bitset
and std::vector<bool>
. The latter is often recommended against, because despite the vector
in the name, it doesn't really act like a vector of any other kind of object, and in fact doesn't meet the requirements for a container in general. Nonetheless, it can be pretty useful.
OTOH, nothing is going to (at least dependably) store 1 million bool values in less than 1 million bits. It simply can't be done with any certainty. If your bit sets contain a degree of redundancy, there are various compression schemes that might be effective (e.g., LZ*, Huffman, arithmetic) but without some knowledge of the contents, it's impossible to say they would be for certain. Either of these will, however, normally store each bool/bit in only one bit of storage (plus a little overhead for bookkeeping -- but that's usually a constant, and on the order of bytes to tens of bytes at most).
Upvotes: 5
Reputation: 9130
If N is known at compile time, use std::bitset, otherwise use boost::dynamic_bitset.
Upvotes: 0
Reputation: 935
A 'bool' type isn't stored using only 1 bit. From your comment about the size, it seems to use 1 entire byte for each bool.
A C like way of doing this would be:
uint8_t sieve[N/8]; //array of N/8 bytes
and then logical OR bytes together to get all your bits:
sieve[0] = 0x01 | 0x02; //this would turn on the first two bits
In that example, 0x01 and 0x02 are hexadecimal numbers that represent bytes.
Upvotes: 3
Reputation: 503785
For better or for worse, std::vector<bool>
will use bits instead of bool's, to save space. So just use std::vector
like you should have been in the first place.
If N
is a constant, you can use std::bitset
.
Upvotes: 15
Reputation: 164281
You can use a byte array and index into that. Index n
would be in byte index n/8
, bit # n%8
. (In case std::bitset is not available for some reason).
Upvotes: 1