Etchelon

Reputation: 882

GCC lets me dereference an iterator from an empty string

I ran the following code on GCC 4.8 without (apparent) problems:

template<class T>
inline
void remove_carriage_return(std::basic_string<T>& s)
{
    static_assert(std::is_same<T, char>::value || std::is_same<T, wchar_t>::value, "Function remove_carriage_return can only accept string or wstring!!\n");

    if (*(s.rbegin()) == '\r')
        s.pop_back();
}

Visual Studio, however, crashed at runtime when I passed the function an empty string (""), because I'm dereferencing an iterator that points to no valid data (much like dereferencing container.end(), I guess). The correct code should be:

template<class T>
inline
void remove_carriage_return(std::basic_string<T>& s)
{
    static_assert(std::is_same<T, char>::value || std::is_same<T, wchar_t>::value, "Function remove_carriage_return can only accept string or wstring!!\n");

    if (s.length() > 0 && *(s.rbegin()) == '\r')
        s.pop_back();
}

Is my deduction correct? If so, why did GCC "optimize out my mistake"?

Upvotes: 1

Views: 500

Answers (2)

edmz

Reputation: 8494

so, why did GCC "optimize out my mistake"?

Have you personally checked that (by looking at the assembly code)? Compiler optimizations do not interfere with the observable behavior of a well-defined program.

Your first snippet has undefined behavior whenever s.length() == 0. It is called undefined behavior because anything might happen: your program might crash or not, and that is unpredictable.

Finally, I would have preferred, imho:

    if (s.begin() != s.end() && *(s.rbegin()) == '\r')
        s.pop_back();
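
For what it's worth, the same guard can also be spelled with empty() and back(), which some find easier to read. A minimal sketch, assuming C++11; the name remove_trailing_cr is made up for illustration and is not from the question:

```cpp
#include <string>

// Hypothetical helper (name not from the question): drops one trailing '\r'.
// empty() is checked first, so back() is never called on an empty string.
template <class CharT>
void remove_trailing_cr(std::basic_string<CharT>& s)
{
    if (!s.empty() && s.back() == CharT('\r'))
        s.pop_back();
}
```

The short-circuit && is what makes this safe: when the string is empty, the right-hand side is never evaluated.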

Upvotes: 0

keltar

Reputation: 18399

It is pointless to expect something that is explicitly specified as undefined to behave in a defined way. The iterator could point anywhere, and there is absolutely no guarantee (and no reason to expect) that this memory area is inaccessible.

However, for performance reasons a string can store its data in a C-string-compatible form (i.e. 0-terminated), so "" would still occupy one byte (0). Dereferencing rbegin() on an empty string, which reads the position just before end(), could therefore land in readable memory right next to that 0 byte. Of course this is an implementation detail, as is the iterator itself (it could be a mere pointer or a class).
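
To make the iterator relationship concrete without actually triggering the undefined behavior, here is a small sketch. Both helper names are hypothetical; the only facts relied on are that rbegin().base() == end() and that dereferencing a reverse iterator reads the element just before its base():

```cpp
#include <string>

// Hypothetical helper: true when dereferencing s.rbegin() would be valid,
// i.e. when the reverse range [rbegin, rend) is non-empty.
template <class CharT>
bool has_last_char(const std::basic_string<CharT>& s)
{
    return s.rbegin() != s.rend();
}

// Hypothetical helper: checks that rbegin() wraps end().
// *s.rbegin() is equivalent to *(s.rbegin().base() - 1), so on an empty
// string it would read one position before begin().
template <class CharT>
bool rbegin_wraps_end(const std::basic_string<CharT>& s)
{
    return s.rbegin().base() == s.end();
}
```

For an empty string has_last_char returns false, which is exactly the case where the unguarded *(s.rbegin()) in the question is undefined.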

Upvotes: 2
