Thomas Eizinger

Reputation: 1462

Strange behaviour of regex in C++11 under Linux

I've been sitting here for nearly a day now and cannot figure out why the C++11 regex library gives me the output it does. This is not about finding the pattern; I already designed and tested it in various regex testers (Regexpal, for example).

An example string I want to process would be:

if12b031, if12b141, ic12a042

These are usernames consisting of letters and digits, with a maximum length of 8 characters, each separated by a comma. The string is entered by the user and must not end with a comma. The spaces around the commas are optional.

This pattern was my approach to solve this problem:

^[A-z0-9]{1,8}(\s*,\s*[A-z0-9]{1,8})*$

Here the user has to enter at least 1 username, but can enter as many as they want, as long as they are separated by commas and each has a maximum length of 8 characters.
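For comparison, the same rule can be checked without any regex at all. The following is only a minimal sketch, assuming plain ASCII usernames and the default "C" locale; the function name is_valid_username_list is hypothetical and not part of the original program.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the original post): returns true if `input`
// is a comma-separated list of usernames, each made of 1 to 8 letters or
// digits, with optional whitespace around the commas and no leading or
// trailing comma or whitespace.
bool is_valid_username_list(const std::string& input)
{
    const std::size_t n = input.size();
    std::size_t i = 0;

    while (true) {
        // Count the letters and digits of the current username.
        std::size_t len = 0;
        while (i < n && std::isalnum(static_cast<unsigned char>(input[i]))) {
            ++i;
            ++len;
        }
        if (len < 1 || len > 8) {
            return false;
        }

        // The list may end directly after a username.
        if (i == n) {
            return true;
        }

        // Otherwise optional whitespace and then a comma must follow.
        while (i < n && std::isspace(static_cast<unsigned char>(input[i]))) {
            ++i;
        }
        if (i == n || input[i] != ',') {
            return false;
        }
        ++i; // consume the comma

        // Skip optional whitespace before the next username.
        while (i < n && std::isspace(static_cast<unsigned char>(input[i]))) {
            ++i;
        }
    }
}
```

Calling is_valid_username_list on the example string above should accept it, while an input that ends with a comma should be rejected.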

Now the strange thing is that this pattern works when I test it in the regex tester mentioned above, but it doesn't work in my code.

I've created a small example program that does nothing but test the pattern.

#include <regex>
#include <string>
#include <iostream>

using namespace std;

int main(int argc, char const *argv[])
{
    string tmp;
    string pattern = "^[A-z0-9]{1,8}(\\s*,\\s*[A-z0-9]{1,8})*$";

    while(true)
    {
        getline(cin, tmp);

        cout << "input: " << tmp << endl;
        cout << "pattern: " << pattern << endl;

        try {
            if(regex_match(tmp, regex(pattern, std::regex_constants::basic))) {
                cout << "match" << endl;
            }
            else
            {
                cout << "no match" << endl;
            }
        } catch (std::regex_error& e) {
            cout << e.code() << endl;
        }
    }
    return 0;
}

I compiled it using the following command:

c++ -std=c++11 -o test test.cpp

What's more, I cannot even get simple patterns like [A-z]{1,8} to work. I only get a match if I enter a single character, but it also matches if I enter a number, and I just don't understand why.

As soon as the input is longer than one character, it always prints "no match". It seems that regex_match does not care about the pattern at all as long as the input length is 1.

Why is that? I honestly can't see where I am making a mistake here. It even matches some special characters like $ or %, but it doesn't match §.

I've tried several regex_constants in the constructor of the regex object.

I am honestly out of ideas as to why this doesn't work.

I am running Ubuntu 13.10 64-bit with Gnome in a virtual machine (VMware), but I also tried it on my laptop, where it is installed as a dual-boot system. The gcc version is 4.8.1.

As this is my first question, I hope I provided enough details for you guys to help me out. Thanks in advance.

Upvotes: 1

Views: 677

Answers (1)

KillianDS

Reputation: 17176

gcc's regex implementation might compile, but that's about it: it is mostly unimplemented in gcc 4.8 (see item 28).
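With a standard library whose <regex> is actually implemented (for libstdc++ that means GCC 4.9 or later), the check from the question could look roughly like the sketch below. It makes two changes that are assumptions on my part, not something the question asked for: it uses the default ECMAScript grammar instead of regex_constants::basic, since JavaScript-based testers such as Regexpal use that syntax, and it writes the character class as [A-Za-z0-9], because the range A-z also covers the characters [, \, ], ^, _ and ` that sit between Z and a in ASCII. The function name matches_username_list is hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the original post): validates a
// comma-separated list of usernames of 1 to 8 letters or digits each.
// Assumes a standard library with a working <regex> implementation.
bool matches_username_list(const std::string& input)
{
    // No syntax flag is passed, so std::regex uses the ECMAScript grammar.
    // The object is built once and reused across calls.
    static const std::regex pattern(
        "^[A-Za-z0-9]{1,8}(\\s*,\\s*[A-Za-z0-9]{1,8})*$");

    // regex_match only succeeds if the whole input matches the pattern.
    return std::regex_match(input, pattern);
}
```

On gcc 4.8 this will still compile, but it should not be expected to give meaningful results.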

Upvotes: 5
