Tommy Tsang
Tommy Tsang

Reputation: 165

c++ std::stringstream gives me weird behavior

The following code give me some unexpected behavior:

#include <map>
#include <iostream>
#include <string>
#include <sstream>

const std::string data1 =
"column1        column2\n"
"1      3\n"
"5      6\n"
"49     22\n";

const std::string data2 =
"column1        column2 column3\n"
"10     20      40\n"
"30     20      10\n";

class IOLoader
{
public:
        // accept an istream and load the next line with member Next()
        IOLoader(std::istream& t_stream) : stream_(t_stream) 
        { 
                for(int i = 0; i < 2; ++i) std::getline(stream_, line_);
        };// get rid of the header

        IOLoader(std::istream&& t_stream) : stream_(t_stream) 
        { 
                for(int i = 0; i < 2; ++i) std::getline(stream_, line_);
        };// get rid of the header

        void Next()
        {
                // load next line
                if(!std::getline(stream_, line_))
                        line_ = "";
        };

        bool IsEnd()
        { return line_.empty(); };

        std::istream& stream_;
        std::string line_;
};

int main()
{
        for(IOLoader data1_loader = IOLoader((std::stringstream(data1))); !data1_loader.IsEnd(); data1_loader.Next())
        {
                std::cout << data1_loader.line_ << "\n";

                // weird result if the following part is uncommented
                /*
                IOLoader data2_loader = IOLoader(std::stringstream(data2));
                std::cout << data2_loader.line_ << "\n";
                data2_loader.Next();
                std::cout << data2_loader.line_ << "\n";
                */
        }
}

I want the class IOLoader to read the string line by line. I get the following result without the commented parts:

1       3
5       6
49      22

This is completely expected. The problem is what happens when I un-comment the part with data2_loader. Now it gives me:

1       3
10      20      40
30      20      10
mn349   22
10      20      40
30      20      10

I have no idea what's going on. This is what I originally expected:

1       3
10      20      40
30      20      10
5       6
10      20      40
30      20      10
49      22
10      20      40
30      20      10

For whatever reason data1 is not read correctly if I create a stringstream with data2. I compile it with g++ 4.9.2. Thanks a lot for your help.

Upvotes: 3

Views: 291

Answers (2)

Michael Veksler
Michael Veksler

Reputation: 8475

Passing an rvalue reference and keeping it around is broken, and will almost certainly cause undefined behavior (UB) down the road. I am referring to the following code, which facilitates but does not directly cause UB:

IOLoader(std::istream&& t_stream) : stream_(t_stream) 
{ 
      for(int i = 0; i < 2; ++i) std::getline(stream_, line_);
};// get rid of the header

The constructor makes it possible for the following line to silently trigger UB:

for(IOLoader data1_loader = IOLoader((std::stringstream(data1))); !data1_loader.IsEnd(); data1_loader.Next())

This line creates a temporary (rvalue) stringstream object. Since this is an rvalue, its reference is merrily passed to the constructor of IOLoader that accepts rvalue-reference. But the constructor that accepts rvalue-reference does not move anything, and simply stores a reference to the temporary stringstream. This is contrary to the normal use of rvalue-references, which is to move the object. By the time the loop body starts, the temporary stringstream is already destroyed, and stream_ refers to a destroyed object. Using such a reference in Next(), or in any other way, is UB.

You can fix this specific instance of the bug by creating a named stingsstream object:

std::stringstream tmp_stream(data1);
for(IOLoader data1_loader = IOLoader(tmp_stream); !data1_loader.IsEnd(); data1_loader.Next())

This will fix the instance but won't fix the core issue. The core issue is the existence of a misleading && constructor. You have two options with the && constructor, either remove it altogether, or make it actually move the stringstream:

class IOLoader
{
...
        IOLoader(std::stringstream&& t_stream) : saved_stream_(std::move(t_stream)), stream_(saved_stream_)
        { 
                for(int i = 0; i < 2; ++i) std::getline(stream_, line_);
        };// get rid of the header

...
        std::stringstream saved_stream_;
        std::istream& stream_;
        std::string line_;
};

The disadvantage is that in this case it will work only with stringstream, and not with similar types such as istringstream. You can make it more generic (a runtime cost of an additional heap allocation) by using templates:

class IOLoader
{
public:
....    
        // enable_if avoids regular references, so that we neither prefer this ctor
        // over the other ctor, nor try to move from a regular lvalue reference.
        template <typename Stream, typename = typename std::enable_if<!std::is_reference<Stream>::value>::type>
        IOLoader(Stream&& t_stream) : saved_stream_(std::make_unique<typename std::decay<Stream>::type>(std::move(t_stream))), stream_(*saved_stream_)
        { 
                for(int i = 0; i < 2; ++i) std::getline(stream_, line_);
        };
...

        std::unique_ptr<std::istream> saved_stream_;
        std::istream& stream_;
        std::string line_;
};

In my opinion this is too complicated for a one-time use and, unless this is going to be used by a lot of code, I'd simply ditch the constructor with the rvalue-reference instead of fixing it.

Upvotes: 2

alter_igel
alter_igel

Reputation: 7202

When you write IOLoader data1_loader = IOLoader((std::stringstream(data1)));, you're binding the IOLoader::stream_ reference member to a temporary, since that std::stringstream(data1) gets destroyed after the constructor. You're left reading from a dangling reference to destroyed object, which is undefined behavior, and absolutely anything may happen as a result. A simple fix would be to declare both stringstream as variables that live for as long as the IOLoader needs them, and remove your IOLoader(std::istream&& t_stream) constructor because it doesn't actually move t_stream, which, as an r-value reference, will generally be a temporary.

std::stringstream ss1 {data1};
for(IOLoader data1_loader = IOLoader(ss1); !data1_loader.IsEnd(); data1_loader.Next()){
    std::cout << data1_loader.line_ << "\n";

    std::stringstream ss2 { data2 };
    IOLoader data2_loader = IOLoader(ss2);
    std::cout << data2_loader.line_ << "\n";
    data2_loader.Next();
    std::cout << data2_loader.line_ << "\n";
}

If you need IOLoader to work very generally with streams that you can't assume ownership of like std::cin, then it makes sense to stick with the reference member. Just be aware that the referenced streams need to live for as long as the stream_ member is being used. Otherwise, if you're only working with std::stringstream, it would be easiest to assume ownership of a stream and make IOLoader::stream_ a value type. For example, you could std::move a stream passed by r-value reference to the constructor.

Upvotes: 3

Related Questions