Lightness Races in Orbit

Reputation: 385144

Why do different GCC 4.9.2 installations give different results for this regex match?

I posted the following code on ideone and Coliru:

#include <iostream>
#include <regex>
#include <string>

int main() 
{
    std::string example{"   <match1>  <match2>    <match3>"};
    std::regex re{"<([^>]+)>"};
    std::regex_token_iterator<std::string::iterator> it{example.begin(), example.end(), re, 1};
    decltype(it) end{};
    while (it != end) std::cout << *it++ << std::endl;
    return 0;
}

Both sites use GCC 4.9.2. I don't know what compilation arguments ideone uses, but there is nothing unusual in Coliru's.

Coliru doesn't give me the match1 result:

Coliru

# g++ -v 2>&1 | grep version; \
# g++ -std=c++14 -O2 -Wall -pedantic -pthread main.cpp && ./a.out
gcc version 4.9.2 (GCC) 
match2
match3

ideone (and, incidentally, Coliru's clang 3.5.0 using libc++)

match1
match2
match3

Does my code have undefined behaviour or something? What could cause this?

Upvotes: 18

Views: 1139

Answers (2)

ecatmur

Reputation: 157344

It's a bug in libstdc++'s regex_token_iterator copy constructor, which is called by the postincrement operator. The bug was fixed in December 2014; releases of GCC 4.9 and 5.x made since then will have the fix. The nature of the bug is that the copy of the iterator aliases the iterator it was copied from, so advancing the original also changes what the copy refers to, leading to the observed behaviour.
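To see why the copy constructor is involved at all, here is a minimal sketch of the conventional shape of the two increment operators. CountingIter is a hypothetical toy type invented for illustration, not libstdc++ code; it only counts how often it is copied.

```cpp
#include <cstddef>

// Hypothetical toy iterator (not libstdc++ code): it only counts copies.
struct CountingIter {
    int pos = 0;
    static std::size_t copies;

    CountingIter() = default;
    CountingIter(const CountingIter& other) : pos(other.pos) { ++copies; }
    CountingIter& operator=(const CountingIter&) = default;

    // Prefix form: advances in place, no copy is made.
    CountingIter& operator++() { ++pos; return *this; }

    // Postfix form: has to hand back the old value, so it copies first.
    CountingIter operator++(int) {
        CountingIter old(*this);  // copy constructor runs here
        ++*this;
        return old;
    }
};

std::size_t CountingIter::copies = 0;
```

With this type, ++it leaves CountingIter::copies untouched, while it++ bumps it at least once (exactly how often depends on copy elision). Any bug in a copy constructor is therefore reachable through the postfix form only.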

The workaround is to use preincrement - this is desirable from a microoptimisation point of view as well, as regex_token_iterator is a reasonably heavy class:

for (; it != end; ++it) std::cout << *it << std::endl;
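For completeness, a sketch of the same loop wrapped in a function, so the iterator is never copied. The name extract_tags is a hypothetical helper introduced here, not something from the question; the regex is the one from the question.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns capture group 1 of every <...> tag in input.
std::vector<std::string> extract_tags(const std::string& input)
{
    static const std::regex re{"<([^>]+)>"};
    std::vector<std::string> out;

    std::sregex_token_iterator it{input.begin(), input.end(), re, 1};
    const std::sregex_token_iterator end{};

    // Preincrement only: the buggy copy constructor is never invoked.
    for (; it != end; ++it)
        out.push_back(it->str());

    return out;
}
```

Called with the string from the question, this yields match1, match2 and match3 in that order.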

Upvotes: 24

Lightness Races in Orbit

Reputation: 385144

The code is valid.

The only plausible explanation is that the standard library versions differ; although for the most part standard library implementations are shipped with compilers, they can be upgraded independently through, say, a Linux package manager.
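One way to check which standard library a given site actually links against is to print its identification macro. This is only a sketch: stdlib_id is a hypothetical helper name, and while __GLIBCXX__ (a date stamp) and _LIBCPP_VERSION are the macros libstdc++ and libc++ define, the libstdc++ date is not a precise indicator of whether a particular fix is present on a release branch.

```cpp
#include <cstddef>  // any standard header pulls in the library's config macros
#include <string>

// Hypothetical helper: names the standard library implementation in use.
std::string stdlib_id()
{
#if defined(__GLIBCXX__)
    // libstdc++: __GLIBCXX__ is a date of the form YYYYMMDD.
    return "libstdc++ " + std::to_string(__GLIBCXX__);
#elif defined(_LIBCPP_VERSION)
    // libc++: _LIBCPP_VERSION is a numeric version.
    return "libc++ " + std::to_string(_LIBCPP_VERSION);
#else
    return "unknown standard library";
#endif
}
```

Printing stdlib_id() on both ideone and Coliru would show whether the two GCC 4.9.2 installations ship the same libstdc++ snapshot.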

In this instance it seems to be a libstdc++ fault that was fixed in late 2014.

The most likely match on Bugzilla that I can find is bug 63497 but, to be honest, I'm not convinced this particular bug was ever fully covered by Bugzilla. Joseph Mansfield identified that, in this case at least, the symptoms are triggered by the postfix increment.

Upvotes: 8
