geza

Reputation: 29962

What drawbacks would exist if std::string::substr returned std::string_view?

Look at this example (taken from here):

class foo {
    std::string my_str_;

public:
    std::string_view get_str() const {
        return my_str_.substr(1u);
    }
};

This code is broken: substr returns a temporary std::string, so the returned std::string_view refers to an object that has already been destroyed. If substr returned std::string_view instead, this problem would not exist.

Besides, it seems logical to me for substr to return std::string_view rather than std::string: a substring is naturally a view into the original string, and returning a view is cheaper because no copy is made.

Would there be any drawbacks if substr returned std::string_view (besides the obvious drawback: losing some compatibility with C++14 - I'm not underrating the importance of this, I'd just like to know whether other drawbacks exist)?

Related question: How to efficiently get a `string_view` for a substring of `std::string`
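
For reference, a view of a substring can already be obtained today by converting the std::string to a std::string_view first and calling substr on the view. A minimal sketch, assuming C++17; the constructor and the example_view_of_member() helper are additions for illustration and are not part of the original class:

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <utility>

class foo {
    std::string my_str_;

public:
    // Constructor added only so the sketch can be exercised (assumption).
    explicit foo(std::string s) : my_str_(std::move(s)) {}

    // Build the view from the member first, then take the substring of the view.
    // std::string_view::substr returns another view, so no temporary
    // std::string is created and nothing dangles on return.
    std::string_view get_str() const {
        return std::string_view(my_str_).substr(1u);
    }
};

// Hypothetical helper showing the intended use.
void example_view_of_member() {
    foo f("hello");
    assert(f.get_str() == "ello");
}
```

The returned view is only valid while the foo object is alive and my_str_ is not modified. Like std::string::substr, std::string_view::substr throws std::out_of_range if the position is past the end, which here means an empty my_str_.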

Upvotes: 7

Views: 1560

Answers (4)

Arthur Tacca

Reputation: 9989

Here is a concrete (if slightly incomplete) example of code that is currently safe, but would become undefined behaviour with the change:

std::string some_fn();
auto my_substr = some_fn().substr(3, 4);
// ... make use of my_substr ...

Arguably the use of auto is a little dubious here, but it is completely reasonable (in my opinion) in the following situation, where repeating the type name would be almost redundant:

const char* some_fn();
auto my_substr = std::string(some_fn()).substr(3, 4);
// ... make use of my_substr ...

Edit: Even if substr() had always returned a std::string_view, you can imagine this code causing some pain, even if only during development/debugging.
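
To make the difference concrete, here is a rough sketch of what auto would deduce in each case. substr_view is a hypothetical free function standing in for a view-returning substr (it is not a real std::string member), and the body of some_fn is made up for illustration. The view-returning variant is only inspected with decltype, never evaluated, so the sketch itself has no undefined behaviour:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <type_traits>

// Hypothetical stand-in for a substr that returns a view (assumption).
std::string_view substr_view(const std::string& s, std::size_t pos, std::size_t count) {
    return std::string_view(s).substr(pos, count);
}

// Example body, assumed for illustration.
std::string some_fn() { return "a reasonably long example string"; }

// Today: the expression yields an owning std::string.
static_assert(std::is_same_v<decltype(some_fn().substr(3, 4)), std::string>);

// With a view-returning substr, the same shape of expression would yield a
// std::string_view into a temporary that dies at the end of the full expression.
static_assert(std::is_same_v<decltype(substr_view(some_fn(), 3, 4)), std::string_view>);

void example_auto_deduction() {
    auto my_substr = some_fn().substr(3, 4);  // owns its characters
    static_assert(std::is_same_v<decltype(my_substr), std::string>);
    assert(my_substr == "easo");
}
```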

Upvotes: 3

Lanting

Reputation: 3068

For one, the underlying data structure of a C++ string is kept mostly compatible with a C string (accessible through the c_str() member). C strings are null-terminated, so you basically just have a starting char pointer, which you keep incrementing until it points to 0.

A substring could thus start at an arbitrary position of your original string. However, as you can't just insert a null somewhere in the original string, your substring would still need to end at the same position as the original.

--edit-- as John Zwinck pointed out, C++ strings can contain \0 chars; however, this would still mean that substrings would lose their c_str member, as providing it would require modifying the original string. This is a drawback of string_view that was also noted in Using std::string_view with api, what expects null terminated string
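
A small sketch of the null-termination point, assuming C++17. The function names are made up for illustration. Calling strlen on view.data() is only safe here because the view points into a std::string, whose buffer is null-terminated; in general a string_view gives no such guarantee:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Length a C API would see if handed the view's data() pointer directly.
std::size_t c_length_seen_through_view() {
    const std::string original = "hello world";
    const std::string_view view = std::string_view(original).substr(0, 5);  // "hello"
    assert(view.size() == 5);
    // data() points into original, so the terminator is the original's.
    return std::strlen(view.data());
}

// Length a C API would see with today's copying substr.
std::size_t c_length_seen_through_copy() {
    const std::string original = "hello world";
    const std::string copy = original.substr(0, 5);  // owns "hello" plus its own terminator
    return std::strlen(copy.c_str());
}

void example_null_termination() {
    assert(c_length_seen_through_view() == 11);  // the whole original string
    assert(c_length_seen_through_copy() == 5);   // just the substring
}
```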

Upvotes: 1

The Quantum Physicist

Reputation: 26276

When string_view was introduced, there was a lot of debate about whether it should exist at all. All the opposing arguments stemmed from examples like the one you showed.

However, as I always tell everyone who brings up such bad examples: C++ is not Java, and it is not Python. C++ is a low-level language where you have almost full control over memory, and I repeat the cliché from Spider-Man: with great power comes great responsibility. If you don't know what string_view is, then don't use it!

The other part of your question has a simple answer, and you answered it yourself:

Would there be any drawbacks if substr returned std::string_view (besides the obvious drawback: losing some compatibility with C++14)?

The harm is that every program that relied on getting a copy of the string from substr may no longer be valid. Backward compatibility is a serious matter in the computer business, which is why Intel's 64-bit processors still accept x86 instructions, which is also why they're not out of business. It costs a lot of money to reinvent the wheel, and money is a major factor in programming. So, unless you're planning to throw all of C++ in the garbage and start over (like Rust did), you should maintain the old rules in every new version.

You can deprecate stuff, but very carefully and very slowly. But deprecation isn't like changing the API, which is what you're suggesting.

Upvotes: 4

John Zwinck

Reputation: 249163

The drawback is crystal clear: it would be a significant API-breaking change relative to every version of C++ going back to the beginning.

C++ is not a language that tends to break API compatibility.

Upvotes: 3
