Reputation: 20818
I am working on a compiler and have a large set of flags. In most cases, my nodes will receive a very small number of flags (about 12 for the largest), but the total number of flags is rather large (over 50.) All the flags are integers defined in an enum:
enum flags_t
{
FLAG_ONE,
FLAG_TWO,
FLAG_THREE,
[...]
MAX_FLAG
};
I am thinking that using an std::map<flags_t, bool>
makes more sense because most of my nodes are likely to use 0, 1, or 2 flags and the number of nodes is really large (it can easily reach tens of thousands.)
// with a map we have to check for existence on a get to avoid creating
// useless entries in the map
bool node::get_flag(flags_t const f)
{
flag_iterator it(f_flags.find(f));
return it == f_flags.end() ? false : it->second;
}
void node::set_flag(flags_t const f, bool const value)
{
f_flags[f] = value;
}
But I'm wondering whether std::vector<bool>
would actually end up being more efficient. Although at first sight this looks good:
bool node::get_flag(flags_t const f)
{
return f_flags[f];
}
void node::set_flag(flags_t const f, bool const value)
{
f_flags[f] = value;
}
The vector needs to be allocated (i.e. sized properly) on initialization, or the get_flag() function needs to test whether f is within the bounds of the vector:
bool node::get_flag(flags_t const f)
{
return f >= f_flags.size() ? false : f_flags[f];
}
The problem I can see with a resize() call on initialization is that we would allocate and free memory all the time, even if we end up never actually using the vector (most nodes don't need any flags!) So testing the limit when we do a get is probably a good trade-off, but we also need to make sure that the vector is large enough on the set_flag() call (in which case we'd probably allocate the whole set of flags at once to avoid reallocations.)
void node::set_flag(flags_t const f, bool const value)
{
if(MAX_FLAG > f_flags.size())
{
f_flags.resize(MAX_FLAG);
}
f_flags[f] = value;
}
So... would std::vector
or std::map
be better? Or would possibly std::set
be even better? (I have not used std::set before...)
Upvotes: 1
Views: 545
Reputation: 155436
Both std::set
and std::map
are a suboptimal choice for flags because they allocate storage dynamically, causing unnecessary fragmentation.
A simple way to represent flags is by storing them in an integral type. An unsigned 64-bit type will provide room for 64 flags. This will be both space-efficient and CPU-efficient, and idiomatic C++ to boot. For example:
enum flag_code
{
FLAG_ONE = 1ULL << 0,
FLAG_TWO = 1ULL << 1,
FLAG_THREE = 1ULL << 2,
[...]
};
typedef uint64_t flags_t;
void node::set_flag(flag_code f, bool value)
{
if (value)
f_flags |= f;
else
f_flags &= ~f;
}
bool node::get_flag(flag_code f)
{
return bool(f_flags & f);
}
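To make the fragments above easier to try out, here is a minimal self-contained sketch. It assumes C++11 (for the fixed underlying enum type and the in-class initializer), and the node class shown is only a hypothetical stand-in for the real compiler node, reduced to the flag member:

```cpp
#include <cstdint>

// Each enumerator owns exactly one bit. The fixed underlying type
// (an assumption of this sketch) guarantees room for 64 flags.
enum flag_code : std::uint64_t
{
    FLAG_ONE   = 1ULL << 0,
    FLAG_TWO   = 1ULL << 1,
    FLAG_THREE = 1ULL << 2
};

typedef std::uint64_t flags_t;

// Hypothetical stand-in for the real node class.
class node
{
public:
    void set_flag(flag_code f, bool value)
    {
        if (value)
            f_flags |= f;                          // turn the bit on
        else
            f_flags &= ~static_cast<flags_t>(f);   // turn the bit off
    }

    bool get_flag(flag_code f) const
    {
        return (f_flags & f) != 0;
    }

private:
    flags_t f_flags = 0;  // all flags start cleared; no heap allocation
};
```

With this layout a node that never touches its flags still costs only eight bytes for the member, and neither get_flag() nor set_flag() needs a bounds check.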
If more than 64 flags are needed, the bit manipulation is best expressed with std::bitset, which also offers array-like access to individual bits of the underlying value:
enum flag_code
{
FLAG_ONE,
FLAG_TWO,
FLAG_THREE,
[...]
MAX_FLAG
};
typedef std::bitset<MAX_FLAG> flags_t;
void node::set_flag(flag_code f, bool value)
{
f_flags[f] = value;
}
bool node::get_flag(flag_code f)
{
return f_flags[f];
}
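A self-contained sketch of the std::bitset variant might look like the following. The bitset is sized with MAX_FLAG, the number of enumerators before it, so that every flag is a valid index. The node class is again a hypothetical stand-in, and has_flags() and flag_count() are illustrative helpers that are not part of the original interface:

```cpp
#include <bitset>
#include <cstddef>

enum flag_code
{
    FLAG_ONE,
    FLAG_TWO,
    FLAG_THREE,
    MAX_FLAG   // number of flags, not a flag itself
};

typedef std::bitset<MAX_FLAG> flags_t;

// Hypothetical stand-in for the real node class.
class node
{
public:
    // set() and test() are bounds-checked and throw std::out_of_range
    // on a bad index, unlike operator[].
    void set_flag(flag_code f, bool value) { f_flags.set(f, value); }
    bool get_flag(flag_code f) const { return f_flags.test(f); }

    // Illustrative helpers (not in the original interface).
    bool has_flags() const { return f_flags.any(); }
    std::size_t flag_count() const { return f_flags.count(); }

private:
    flags_t f_flags;  // default-constructed with all bits cleared
};
```

Since the size is a compile-time constant, the bits live inside the node itself and nothing is allocated on the heap, however many flags are defined.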
Upvotes: 4