Rudolfs Bundulis

Reputation: 11934

Differences in regex support between gcc 4.9.2 and gcc 5.3

Can anyone more familiar with gcc point out why the sample below fails to match on gcc 4.9.2 but succeeds on gcc 5.3? Is there anything I can do to alter the pattern so that it works on both (it also seems to work fine on VS 2013)?

#include <cstring>
#include <iostream>
#include <regex>

std::regex pattern("HTTP/(\\d\\.\\d)\\s(\\d{3})\\s(.*)\\r\\n(([!#\\$%&\\*\\+\\-\\./a-zA-Z\\^_`\\|-]+\\:[^\\r]+\\r\\n)*)\\r\\n");

const char* test = "HTTP/1.1 200 OK\r\nHost: 192.168.1.72:8080\r\nContent-Length: 86\r\n\r\n";

int main()
{
    std::cmatch results;
    bool matched = std::regex_search(test, test + strlen(test), results, pattern);
    std::cout << matched;
    return 0;
}

I assume I am using something that is not supported in gcc 4.9.2 but was added or fixed later, but I have no idea where to look it up.

UPDATE

Given the amount of help and suggestions, I tried to track down the issue instead of just switching to gcc 5. I get correct matches with this modification:

#include <cstring>
#include <iostream>
#include <regex>

std::regex pattern("HTTP/(\\d\\.\\d)\\s(\\d{3})\\s(.*?)\\r\\n(?:([^:]+\\:[^\\r]+\\r\\n)*)\\r\\n");

const char* test = "HTTP/1.1 200 OK\r\nHost: 192.168.1.72:8080\r\nContent-Length: 86\r\n\r\n";

int main()
{
    std::cmatch results;
    bool matched = std::regex_search(test, test + strlen(test), results, pattern);
    std::cout << matched << std::endl;
    if (matched)
    {
        for (const auto& result : results)
        {
            std::cout << "matched: " << result.str() << std::endl;
        }
    }
    return 0;
}

So I guess the problem is with the group that matches the HTTP header name. Will check further.

UPDATE 2

std::regex pattern(R"(HTTP/(\d\.\d)\s(\d{3})\s(.*?)\r\n(?:([!#$&a-zA-Z^_`|-]+\:[^\r]+\r\n)*)\r\n)");

is the last thing that works. Adding any of the remaining characters that I had in my group - %*+-. (escaped or unescaped) - breaks it.

Upvotes: 12

Views: 1328

Answers (1)

HackerBoss

Reputation: 829

As far as I know, GCC did not officially support the C++11 regex library until GCC 4.9; see "Is gcc 4.8 or earlier buggy about regular expressions?". Since the implementation was so new, it likely still had a few bugs to iron out. Pinning down the exact cause would be difficult, but the problem is in the implementation and not in your regex.

Side note: I once spent 20 minutes trying to figure out what was wrong with my regex before I found the question linked above and realized that I was using gcc 4.8.*. Since the machine I had to run on wasn't mine, I ended up compiling on a different, similar platform with a later version of gcc and a few hacks, and the result then ran on the target platform.

Upvotes: 2
