user1290696

Reputation: 509

C++11 and [17.5.2.1.3] Bitmask Types

The Standard allows one to choose between an integer type, an enum, and a std::bitset.
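To make the three choices concrete, here is a minimal sketch of the same two-flag mask written each way. All names (`int_mask`, `enum_mask`, `bitset_mask` and their constants) are hypothetical and not taken from any library; this only illustrates the shape of each option.

```cpp
#include <bitset>
#include <cassert>

// Option 1: an integer type with named constants (hypothetical names).
typedef unsigned int int_mask;
const int_mask int_a = 1u << 0;
const int_mask int_b = 1u << 1;

// Option 2: an enumeration plus an overloaded operator.
enum enum_mask { enum_a = 1 << 0, enum_b = 1 << 1 };
inline enum_mask operator|(enum_mask x, enum_mask y) {
    return static_cast<enum_mask>(static_cast<int>(x) | static_cast<int>(y));
}

// Option 3: a std::bitset with one bit per flag.
typedef std::bitset<2> bitset_mask;
const bitset_mask bitset_a(1u << 0);
const bitset_mask bitset_b(1u << 1);

// All three representations hold the same combined value.
inline bool all_three_agree() {
    int_mask i = int_a | int_b;
    enum_mask e = enum_a | enum_b;
    bitset_mask b = bitset_a | bitset_b;
    return i == 3u && static_cast<int>(e) == 3 && b.to_ulong() == 3ul;
}

inline void demo_three_options() {
    assert(all_three_agree());
}
```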

Why would a library implementor use one over the other given these choices?

Case in point, llvm's libcxx appears to use a combination of (at least) two of these implementation options:

ctype_base::mask is implemented using an integer type: <__locale>

regex_constants::syntax_option_type is implemented using an enum + overloaded operators: <regex>

The gcc project's libstdc++ uses all three:

ios_base::fmtflags is implemented using an enum + overloaded operators: <bits/ios_base.h>

regex_constants::syntax_option_type is implemented using an integer type and regex_constants::match_flag_type is implemented using a std::bitset, both in <bits/regex_constants.h>

AFAIK, gdb cannot "detect" the bitfieldness of any of these three choices so there would not be a difference wrt enhanced debugging.

The enum solution and integer type solution should always use the same space. std::bitset does not seem to guarantee that sizeof(std::bitset<32>) == sizeof(std::uint32_t), so I don't see what is particularly appealing about std::bitset.
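The size point can be checked with a short sketch. It assumes a platform that provides `std::uint32_t`, and `fixed_enum_mask` is a hypothetical name. An enum with a fixed underlying type has the size of that type, while for `std::bitset<32>` the only thing that can be relied on is that it is large enough to hold 32 bits; its actual size is left to the implementation.

```cpp
#include <bitset>
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>

// Hypothetical mask with a fixed underlying type (C++11).
enum fixed_enum_mask : std::uint32_t { flag_lo = 1u << 0, flag_hi = 1u << 31 };

// Guaranteed: the enum occupies exactly as much space as its underlying type.
static_assert(sizeof(fixed_enum_mask) == sizeof(std::uint32_t),
              "enum and integer representations take the same space");

// Not guaranteed to equal sizeof(std::uint32_t); the value is implementation-specific.
inline std::size_t bitset32_bytes() { return sizeof(std::bitset<32>); }

inline void demo_sizes() {
    // Only a lower bound holds: the object must be able to store 32 bits.
    assert(bitset32_bytes() * CHAR_BIT >= 32);
}
```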

The enum solution seems slightly less type safe because a combination of masks does not correspond to any enumerator.
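A small sketch of that concern, using a hypothetical `color_mask`: the result of combining two enumerators is a valid value of the enum type, yet it matches none of the named enumerators, so code that switches over the enumerators falls through.

```cpp
#include <cassert>

// Hypothetical bitmask enum; values 0..7 are all representable.
enum color_mask { red = 1, green = 2, blue = 4 };

inline color_mask operator|(color_mask a, color_mask b) {
    return static_cast<color_mask>(static_cast<int>(a) | static_cast<int>(b));
}

// Returns true only when m is exactly one of the named enumerators.
inline bool is_named_enumerator(color_mask m) {
    switch (m) {
        case red:
        case green:
        case blue:
            return true;
    }
    return false; // reached for combinations such as red | green
}

inline void demo_combination() {
    assert(is_named_enumerator(blue));
    assert(!is_named_enumerator(red | green));
}
```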

Strictly speaking, the aforementioned is with respect to n3376 and not FDIS (as I do not have access to FDIS).

Any available enlightenment in this area would be appreciated.

Upvotes: 7

Views: 2164

Answers (3)

Jonathan Wakely

Reputation: 171263

My preference is to use an enum, but there are sometimes valid reasons to use an integer. Usually ctype_base::mask interacts with the native OS headers, with a mapping from ctype_base::mask to the <ctype.h> implementation-defined constants such as _CTYPE_U and _CTYPE_L used for isupper, islower, etc. Using an integer might make it easier to use ctype_base::mask directly with native OS APIs.

I don't know why libstdc++'s <regex> uses a std::bitset. When that code was committed I made a mental note to replace the integer types with an enumeration at some point, but <regex> is not a priority for me to work on.

Upvotes: 2

Potatoswatter

Reputation: 137800

The really surprising thing is that the standard restricts it to just three alternatives. Why shouldn't a class type be acceptable? Anyway…

  • Integral types are the simplest alternative, but they lack type safety. Very old legacy code will tend to use these as they are also the oldest.
  • Enumeration types are safe but cumbersome, and until C++11 they tended to be fixed to the size and range of int.
  • std::bitset may have somewhat more type safety in that bitset<5> and bitset<6> are different types, and addition is disallowed, but otherwise it is unsafe much like an integral type. This wouldn't be an issue if they had allowed types derived from std::bitset<N>.
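The bitset point in the list above can be sketched as follows. The typedef names are hypothetical. Bitsets of different widths are distinct types and have no `operator+`, but two unrelated masks of the same width are the same type and mix without complaint.

```cpp
#include <bitset>
#include <cassert>
#include <type_traits>

typedef std::bitset<5> file_flags;    // hypothetical mask type
typedef std::bitset<5> socket_flags;  // unrelated meaning, but the same type
typedef std::bitset<6> wide_flags;    // different width, different type

static_assert(!std::is_same<file_flags, wide_flags>::value,
              "bitset<5> and bitset<6> are distinct types");
static_assert(std::is_same<file_flags, socket_flags>::value,
              "same-width bitsets are indistinguishable");

// Compiles even though the two arguments are conceptually unrelated masks.
inline file_flags combine(file_flags a, socket_flags b) { return a | b; }

// Neither of these would compile:
//   file_flags(1) + file_flags(2);   // no operator+ for std::bitset
//   file_flags(1) | wide_flags(2);   // operator| requires equal widths

inline void demo_bitset_safety() {
    assert(combine(file_flags(1), socket_flags(2)).to_ulong() == 3ul);
}
```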

Clearly enums are the ideal alternative, but experience has proven that the type safety is really unnecessary. So they threw implementers a bone and allowed them to take easier routes. The short answer, then, is that laziness leads implementers to choose int or bitset.

It is a little odd that types derived from bitset aren't allowed, but really that's a minor thing.

The main specification that clause provides is the set of operations defined over these types (i.e., the bitwise operators).
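For the enum case, that operator set might be written roughly as below. This is a sketch along the lines of the illustration in the clause itself, with a hypothetical `perm` type; `is_set` is a helper added here for the example, not something the Standard names.

```cpp
#include <cassert>

// Hypothetical bitmask enum with a fixed underlying type.
enum perm : unsigned {
    perm_none  = 0,
    perm_read  = 1u << 0,
    perm_write = 1u << 1,
    perm_exec  = 1u << 2
};

constexpr perm operator&(perm x, perm y) {
    return static_cast<perm>(static_cast<unsigned>(x) & static_cast<unsigned>(y));
}
constexpr perm operator|(perm x, perm y) {
    return static_cast<perm>(static_cast<unsigned>(x) | static_cast<unsigned>(y));
}
constexpr perm operator^(perm x, perm y) {
    return static_cast<perm>(static_cast<unsigned>(x) ^ static_cast<unsigned>(y));
}
constexpr perm operator~(perm x) {
    return static_cast<perm>(~static_cast<unsigned>(x));
}
inline perm& operator&=(perm& x, perm y) { x = x & y; return x; }
inline perm& operator|=(perm& x, perm y) { x = x | y; return x; }
inline perm& operator^=(perm& x, perm y) { x = x ^ y; return x; }

// Helper for this example: y is set in x if x & y is nonzero.
inline bool is_set(perm x, perm y) { return (x & y) != perm_none; }

inline void demo_operators() {
    perm p = perm_read;
    p |= perm_write;                 // set a flag
    assert(is_set(p, perm_write));
    p &= ~perm_write;                // clear a flag
    assert(!is_set(p, perm_write));
    assert(p == perm_read);
}
```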

Upvotes: 2

Bo Persson

Reputation: 92241

Why would the standard allow different ways of implementing the library? And the answer is: Why not?

As you have seen, all three options are obviously used in some implementations. The standard doesn't want to make existing implementations non-conforming, if that can be avoided.

One reason to use a bitset could be that its size fits better than an enum or an integer. Not all systems even have a std::uint32_t. Maybe a bitset<24> will work better there?

Upvotes: 0
