Reputation: 12864
Does the following program have undefined behavior?
#include <iostream> // std::{ostream, streambuf}
// The streambuf ctor is protected so we need a wrapper to create one.
struct mystreambuf : public std::streambuf {};
extern mystreambuf sb; // Not yet constructed.
std::ostream os(&sb); // Passing "invalid" pointer here? UB?
mystreambuf sb; // Now it is constructed.
int main() { return 0; }
It invokes the ostream constructor, passing a pointer to a streambuf object whose lifetime has not yet begun (basic.life p1).
Does this constitute undefined behavior?
If streambuf were a user-written class, then class.cdtor p1 would govern, which says:
For an object with a non-trivial constructor, referring to any non-static member or base class of the object before the constructor begins execution results in undefined behavior. [...]
This language, and its accompanying example, make it clear that merely taking the address of an unconstructed object is not undefined. As far as I can tell, passing that address as a pointer to a user-written function that only stores its value and tests it against nullptr is also not undefined.
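To make the user-code case concrete, here is a minimal sketch. The names Widget, Holder and check_holder are hypothetical and not part of the question; the point is only that Holder stores the pointer and compares it to nullptr without touching any member or base class of the pointee.

```cpp
#include <cassert>

// Hypothetical user-written type with a non-trivial constructor.
struct Widget {
    Widget() : value(42) {}
    int value;
};

// Hypothetical holder: stores the pointer and tests it against nullptr,
// and never dereferences or converts it.
class Holder {
public:
    explicit Holder(Widget* p) : ptr_(p), non_null_(p != nullptr) {}
    Widget* get() const { return ptr_; }
    bool non_null() const { return non_null_; }
private:
    Widget* ptr_;
    bool non_null_;
};

extern Widget w;   // declared here, defined further down
Holder h(&w);      // only the address is taken; no member or base of w is referenced
Widget w;          // defined (and initialized) after h

// Call once static initialization has finished, for example from main.
void check_holder() {
    assert(h.get() == &w);
    assert(h.non_null());
    assert(w.value == 42);
}
```

Note that no derived-to-base conversion is involved here, because Holder takes a Widget* and receives exactly that type.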
But streambuf is a library class, so instead res.on.arguments p1 applies, which says, in part:
If an argument to a function has an invalid value (such as a value outside the domain of the function or a pointer invalid for its intended use), the behavior is undefined.
But what constitutes an "invalid value"? Presumably we have to determine the "intended use" by reading the specification of the called function. The constructor spec ostream.cons p1 says in part:
Effects: Initializes the base class subobject with basic_ios<charT, traits>::init(sb) ([basic.ios.cons]).
The spec for init, basic.ios.cons p4, says:
Postconditions: The postconditions of this function are indicated in Table 127.
where Table 127 has two rows that mention sb:
Element Value
------- -----
rdbuf() sb
rdstate() goodbit if sb is not a null pointer, otherwise badbit.
So, at first glance, this would seem to suggest that sb is only stored (so that rdbuf() can return it) and tested for being nullptr; and that these together comprise its "intended use". Since both of these would be legal for user-written code to do, it is legal to pass the pointer in question, so the program has defined behavior.
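The two Table 127 rows quoted above can be observed directly. The sketch below uses a fully constructed std::stringbuf, so it says nothing about the unconstructed case; check_init_postconditions is a hypothetical helper name.

```cpp
#include <cassert>
#include <ios>
#include <ostream>
#include <sstream>

// Observes the rdbuf() and rdstate() postconditions of init(sb)
// through the ostream constructor.
void check_init_postconditions() {
    std::stringbuf buf;                // constructed before the stream sees it
    std::ostream with_buf(&buf);
    assert(with_buf.rdbuf() == &buf);                        // rdbuf() == sb
    assert(with_buf.rdstate() == std::ios_base::goodbit);    // sb is not null

    std::ostream without_buf(static_cast<std::streambuf*>(nullptr));
    assert(without_buf.rdbuf() == nullptr);                  // rdbuf() == sb
    assert(without_buf.rdstate() == std::ios_base::badbit);  // sb is null
}
```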
But Table 127 is merely a list of postconditions. It does not definitively assert that nothing else is in the scope of "intended use". For that, it would seem necessary to exhaustively review everything that basic_ostream and its subclasses potentially do with sb.
While attempting to do so, I find imbue at basic.ios.members p9:
Effects: Calls ios_base::imbue(loc) and if rdbuf() != 0 then rdbuf()->pubimbue(loc).
Clearly, calling rdbuf()->pubimbue(loc) before the object pointed to by rdbuf() is constructed is undefined. Do we call imbue? Not explicitly of course, and there's no particular reason to suspect an indirect call either, but the existence of this behavior arguably puts it in scope of the "intended use" of the pointer passed to the constructor, since eventually it could be used this way. Furthermore, would it necessarily be non-conforming for an implementation to call imbue on its own during the ostream constructor? I don't see why it would be, and if an implementation is free to call imbue in the constructor, then clearly we have undefined behavior. And there could be other methods that suggest other usages, as my survey was by no means complete.
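The forwarding from imbue to the stream buffer can be seen with a small probe. This is a sketch under the assumption of a hypothetical counting_buf class that overrides the protected virtual imbue; it records the count after construction rather than assuming what the ostream constructor does.

```cpp
#include <cassert>
#include <locale>
#include <ostream>
#include <streambuf>

// Hypothetical probe: counts calls to the protected virtual imbue(),
// which pubimbue() is specified to call.
class counting_buf : public std::streambuf {
public:
    int imbue_calls = 0;
protected:
    void imbue(const std::locale&) override { ++imbue_calls; }
};

void check_imbue_forwarding() {
    counting_buf buf;                  // fully constructed before use
    std::ostream os(&buf);
    const int before = buf.imbue_calls;
    os.imbue(std::locale::classic());  // dereferences rdbuf() via pubimbue
    assert(buf.imbue_calls > before);
}
```

This only demonstrates that the stored pointer is dereferenced later by imbue; it does not settle whether an implementation may do so during construction.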
Now, in a comment on an answer to a related question, indi observes that the Clang implementation of std::basic_fstream does pass a pointer to an unconstructed member object to the iostream constructor at fstream:1419:
basic_filebuf<char_type, traits_type> __sb_;
};
template <class _CharT, class _Traits>
inline basic_fstream<_CharT, _Traits>::basic_fstream() : basic_iostream<char_type, traits_type>(&__sb_) {}
But this example is not definitive because (1) it could be a mistake, and (2) the library implementation is generally allowed to do things that would be undefined in user code. Nevertheless, it is at least weak evidence that the Clang developers think the practice does not have undefined behavior, as they have no reason in this case to write code that relies on the library's license to bend the rules, since it would be a trivial change to instead pass nullptr to the constructor and then in the body call init with the address of the (now fully constructed) member object.
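A user-code sketch of that alternative follows. The class name late_bound_ostream is hypothetical, and the sketch swaps in rdbuf(&buf_) for the second init call, since rdbuf(sb) is specified to store the pointer and call clear(); whether init may be called twice is a separate question this example avoids.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stream that never hands the base class a pointer to an
// unconstructed member: it starts with a null buffer and attaches the
// real one in the constructor body, after buf_ has been constructed.
class late_bound_ostream : public std::ostream {
public:
    late_bound_ostream()
        : std::ostream(static_cast<std::streambuf*>(nullptr)) {
        rdbuf(&buf_);   // stores the pointer and clears badbit
    }
    std::string str() const { return buf_.str(); }
private:
    std::stringbuf buf_;
};

void check_late_bound_ostream() {
    late_bound_ostream s;
    s << "hi" << 42;
    assert(s.good());
    assert(s.str() == "hi42");
}
```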
Ultimately, it seems to me that the language specification is ambiguous, as it relies on the terms "invalid value" and "intended use" which are not clearly specified. But perhaps someone can identify a provision I have missed or an error in my interpretations.
While researching this, I came across some existing questions that seemed related. The question How to inherit from std::ostream? has three relevant answers:
The (highest-voted) answer by Ben was specifically edited to avoid the potential problem by ensuring the streambuf is constructed before passing its address.
A more recent answer by mach6 also goes out of its way to avoid passing the unconstructed object's pointer, this time by initializing the ostream with nullptr (albeit by using a non-standard constructor that only GNU libstdc++ has, but is easily replaced with a standard one) and then calling init afterward.
The answer by Henrik Heino passes the not-yet-constructed pointer. But this answer does not claim to be correct, and has one comment that says passing the pointer that way is incorrect.
From these answers and comments, I infer that quite a few knowledgeable people believe that the example at the top of this question has undefined behavior.
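One common way to ensure the buffer is constructed first is the base-from-member idiom. The sketch below is my own illustration with hypothetical names (buf_holder, my_ostream), not a reproduction of any of the linked answers: the buffer lives in a base class listed before std::ostream, so it is already constructed when its address is converted and passed on.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper base that owns the buffer.
struct buf_holder {
    std::stringbuf buf;
};

// buf_holder is listed before std::ostream, so buf is constructed
// before std::ostream's constructor receives its address.
class my_ostream : private buf_holder, public std::ostream {
public:
    my_ostream() : buf_holder(), std::ostream(&buf) {}
    std::string str() const { return buf.str(); }
};

void check_my_ostream() {
    my_ostream s;
    s << "x=" << 7;
    assert(s.good());
    assert(s.str() == "x=7");
}
```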
Meanwhile, the question Is it dangerous to pass a pointer to a subobject that is not constructed yet to a constructor of another subobject during the object construction? is very nearly the same as mine, but is marred by having some important parts of the example code missing, and involves an extraneous AnotherClass that further muddies the question. The answer by aschepler seems to say that the practice is ok in general, but not in the OP's case because of AnotherClass, but it only reasons as if all of the code were written by the user, ignoring the library aspect.
Finally, the question Is it safe to pass an unconstructed buffer to the constructor of std::ostream? is essentially the same as mine--I'm asking a duplicate! Why? In short, that question has no answers, and I think the additional research in my question makes it more likely mine can be answered, so I'm effectively submitting this with the intention of replacing that one. I asked a meta question about whether asking this duplicate is acceptable, and the consensus seems to be that it is.
I've accepted Chris Dodd's answer, but I want to elaborate a little on it, so this is a restatement of that answer in my own words.
The original example has undefined behavior because, in this line:
std::ostream os(&sb); // Passing "invalid" pointer here? UB?
the expression &sb has type mystreambuf*, but is being passed to a constructor that accepts std::streambuf*, and therefore must undergo derived-to-base conversion. That conversion, applied to a pointer to an unconstructed object with a non-trivial constructor, has undefined behavior since it is a "[reference] to any [...] base class of the object", which is prohibited by class.cdtor p1.
The example in that section further clarifies. Quoting the key lines from it:
struct X { int i; };
struct Y : X { Y(); }; // non-trivial
struct A { int a; };
struct B : public A { int j; Y y; }; // non-trivial
extern B bobj;
A* pa = &bobj; // undefined behavior: upcast to a base class type
B bobj; // definition of bobj
Moreover, this means that not only is the specific example in the question undefined, but it is in general undefined to do what the question title says, namely to "pass a pointer to an unconstructed streambuf object to the ostream constructor". That is because the std::streambuf constructor is protected, so an instance must always be a proper base class subobject, and therefore the only way to obtain a std::streambuf* is with a derived-to-base conversion.
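For the namespace-scope example in the question, the simplest way to avoid the problematic conversion is to define the buffer first. A minimal sketch, reusing the question's names and adding a hypothetical check_reordered helper:

```cpp
#include <cassert>
#include <iostream>

struct mystreambuf : public std::streambuf {};

// Within one translation unit these are initialized in order of
// definition, so sb is constructed before os.
mystreambuf sb;         // constructed first
std::ostream os(&sb);   // derived-to-base conversion applies to a live object

void check_reordered() {
    assert(os.rdbuf() == &sb);
    assert(os.good());
}
```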
That implies that the code quoted from the Clang libc++ would have undefined behavior if it were user code, and I have filed Issue #93307 against Clang about that.
Upvotes: 9
Views: 314
Reputation: 121
So, @JerryCoffin’s answer is correct, but there is an objection to it on the grounds that while the standard clearly specifies what basic_ios::init() does, it doesn’t specify what it doesn’t do. So (the objection goes), while the standard asserts that the only things basic_ios::init() does with the passed pointer are compare it to nullptr and store it… it might also dereference it, which would trigger UB in the situation described.
Okay, let’s assume that logic makes sense.
So, because basic_ios::init() “might” dereference the pointer, and because the basic_ostream constructor calls basic_ios::init(), we can’t pass a pointer to a member. So we can’t do this:
class myostream :
public std::ostream
{
std::stringbuf _buf; // a concrete buffer; std::streambuf's own constructor is protected
public:
myostream() : std::ostream{&_buf} {}
// other stuff...
};
Because although the standard specifies that the postconditions of the ostream constructor (indirectly/transitively) just compare the pointer passed to nullptr and keep a copy… the postconditions are not necessarily exhaustive. So it might dereference the pointer for some unknown reasons.
If so, that would be UB. So how would we avoid that?
The solution offered looks like this:
class myostream :
private std::streambuf,
public std::ostream
{
public:
myostream() : std::ostream{this} {}
// other stuff...
};
So, great! Problem solved, right?
Well, no.
Because, you see, the standard doesn’t say that the ostream constructor or basic_ios::init() don’t delete the pointer passed.
basic_ios::init() might do this:
auto basic_ios::init(streambuf* p_buf)
{
// do all the stuff init() is specified to do, and then...
delete p_buf;
}
Why not? The postconditions don’t say explicitly that the stream buffer pointed to by the argument won’t be deleted. And that doesn’t contradict the postconditions.
Or maybe it does this:
auto basic_ios::init(streambuf* p_buf)
{
// do all the stuff init() is specified to do, and then...
p_buf->~streambuf();
::new (static_cast<void*>(p_buf)) streambuf{};
}
Again, why not? That wouldn’t literally contradict the precise wording of the contract of basic_ios::init() as spelled out in the standard. So it could happen, right?
If you suppose that basic_ios::init() is free to do anything with the pointer that it doesn’t explicitly say it won’t, then your clever inheritance strategy won’t work either. In fact… literally nothing will work. If basic_ios::init() is allowed to do LITERALLY ANYTHING with the pointer you pass it, so long as it doesn’t contradict the explicit wording of the contract, then you can’t assume anything about the stream buffer pointer you pass to it. You can’t assume it won’t be destroyed. You can’t assume it will be destroyed. You can’t assume it won’t be overwritten.
So basically, basic_ios::init() is just impossible to use safely. Which means it is impossible to create our own output streams, because we must call basic_ios::init(), directly or indirectly, at some point (before the destructor, or any member functions).
So, there’s your conclusion. It is just impossible to create your own custom streams or stream buffers, because the standard writers didn’t explicitly rule out every asininely imaginable possible contingency for what might happen with that pointer.
Or… maybe… our logic went off the rails somewhere.
Look, the people writing the standard are not doing it for the sake of a group of D&D players who get off on picking apart the micro-semantics of every single rule clause looking for a way to game the system. The committee has neither the time nor the patience to cater to every absurd rule-twisting fanatic’s desire to find loopholes. They will include as much explicit detail as is necessary for reasonable implementers to produce implementations that behave consistently with each other and with what reasonable readers of the standard will understand from it.
So let’s approach this like reasonable people.
The standard specifies what basic_ios::init() does with the pointer passed. It says nothing about the pointer being dereferenced, not even a non-normative note suggesting that might be the case.
Yes, it does not explicitly state that the pointer won’t be dereferenced (or deleted, or anything else). But consider this: As I pointed out in another comment, Clang’s libc++ does basically what the first code block above does. If there were a reasonable interpretation of the contract of basic_ios::init() that implied the pointer might be dereferenced… wouldn’t somebody have noticed the problem in the decade or so that libc++ has been in widespread use? Don’t you think that, maybe, a sanitizer or two might have noticed?
And, out of curiosity, I also checked the Microsoft standard library source code. Yup, it does the same thing: passes a pointer to a stream buffer data member. That’s two major, widely-used standard libraries. I don’t know how long that particular standard library has been in use, but again… don’t you think somebody would have raised the issue by now if it were a reasonable interpretation of the standard that that stream buffer pointer might be de-referenced before the stream buffer is constructed?
(And I can’t dig up my copy of Langer & Kreft right now, but I’m pretty sure they do the same thing, too.)
Once again: be reasonable. IOstreams has been in the standard since 1998, and it was a widely used library even before that, going back as far as 1984. The wording has been pored over, revised, and studied in dozens and dozens of defect reports. If “it doesn’t say it doesn’t dereference” were a reasonable interpretation of the standard’s definition of basic_ios::init()… don’t you think someone would have done something about that sometime in the last ~30–40 years? Don’t you think someone working on or with the Microsoft standard library OR Clang’s standard library, or one of the many, many people who have made their own custom streams (including the people making new standard custom streams, like in networking proposals), would have pointed out the issue?
Be reasonable. The standard doesn’t have to explicitly say the pointer won’t be dereferenced, because that would be a pants-crappingly stupid thing to do to a pointer that you haven’t specified must point to a valid stream buffer. Everything else in basic_ios follows that reasoning: the destructor also doesn’t delete the pointer. Indeed, if basic_ios::init() were allowed to dereference the pointer, that would wildly complicate the process of making a custom stream. And for what? For what gain? Why would the IOStreams library be better if it did allow for basic_ios::init() to dereference the stream pointer? How would that compare to the many ways it would be massively worse if you couldn’t assume it was safe to pass a pointer to a member stream buffer?
Conclusion: The fact that the standard wording doesn’t explicitly state… that the things it explicitly states it does with the stream buffer pointer are the only things it does with it… does not imply it may do any random thing with the stream buffer pointer. Especially things that might create UB if they were done unexpectedly. If it required a pointer to a valid stream buffer, it would say so. It does not, and instead lists a bunch of things that don’t require a valid stream buffer.
Suggestion: Don’t treat the standard like a riddle and pick through its wording looking for traps.
CLEARLY the intention is for basic_ios::init() to just compare the pointer to nullptr and keep a copy. It makes no damn sense to not have that be the implication and instead require stream implementers to resort to gymnastics like multiple inheritance (or dynamic allocation followed by rdbuf() to retrieve the pointer later, or other wacky, circuitous ideas). I mean… why? Why would you design the library like that? That would be absurd. Why would you so unnecessarily hamstring the obvious and safest way to implement a stream with an underlying stream buffer?
tl;dr: 1) @JerryCoffin is correct that the behaviour is defined, by reasonable implication from the standard wording. 2) The first code block is fine, and you can pass a pointer to an uninitialized stream buffer to basic_ios::init(). 3) Two major standard libraries work that way, and have done so for decades without any concern raised. 4) There are no rhetorical traps in the C++ standard.
Upvotes: 2
Reputation: 126418
The language you quoted
For an object with a non-trivial constructor, referring to any non-static member or base class of the object before the constructor begins execution results in undefined behavior. [...]
would seem to indicate this is undefined behavior -- you're referring to the base class (std::streambuf) of an object before the constructor has run. What happens in the ostream constructor is irrelevant.
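The reason the ostream constructor is irrelevant is that the base-class reference happens at the call site, before the constructor is entered. A compile-time sketch of that, reusing the question's mystreambuf (the static_assert messages are mine):

```cpp
#include <streambuf>
#include <type_traits>

struct mystreambuf : public std::streambuf {};
extern mystreambuf sb;   // only named in unevaluated operands below

// &sb is a pointer to the derived type, not to std::streambuf...
static_assert(std::is_same<decltype(&sb), mystreambuf*>::value,
              "&sb has type mystreambuf*");
static_assert(!std::is_same<decltype(&sb), std::streambuf*>::value,
              "&sb is not already a std::streambuf*");

// ...so passing it where std::streambuf* is expected requires an
// implicit derived-to-base pointer conversion in the caller.
static_assert(std::is_convertible<mystreambuf*, std::streambuf*>::value,
              "the argument is converted before the constructor runs");
```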
Upvotes: 3