LnlB

Reputation: 300

PCRE regex with multiple substrings

I would like to extract two substrings from this line:

open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3

The first should be /etc/ld.so.cache and the second O_RDONLY|O_CLOEXEC.

So I wrote this:

int main()
{

    char * line = "open(\"/etc/ld.so.cache\", O_RDONLY|O_CLOEXEC) = 3";

    int         rc;
    size_t      nmatch = 3;

    regex_t     reg;
    regmatch_t  pmatch[3];

    char * regex = "open\(\"\\([^\"]*\\)\",[ ]\\([^\)]*\\)\).*";

    rc = regcomp(&reg, regex, REG_NOSUB | REG_EXTENDED);

    rc = regexec(&reg, line, nmatch, pmatch, 0);
    if (!rc) {
            printf("Matched substring \"%.*s\" is found at position %d to %d.\n",
                     pmatch[1].rm_eo - pmatch[1].rm_so, &line[pmatch[1].rm_so],
                     pmatch[1].rm_so, pmatch[1].rm_eo - 1);

    }

    regfree(&reg);

    return 0;
}

But it doesn't return the first group.

Could you tell me whether my regex is correct?

Upvotes: 0

Views: 56

Answers (1)

Wintermute

Reputation: 44063

Nearly. In POSIX extended regexes (which is what REG_EXTENDED selects), parentheses and other special characters have to be escaped if you want them to match themselves rather than act as operators; unescaped parentheses are what form the capture groups. So it has to be

char const * regex = "open\\(\"([^\"]*)\", *([^\\)]*)\\).*";

In addition, if you want the captures, you have to compile the regex without REG_NOSUB:

rc = regcomp(&reg, regex, REG_EXTENDED);
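As a rough sketch of how the two changes fit together, the helper below compiles the pattern above with REG_EXTENDED only and copies both captures into caller-supplied buffers. The names extract_open_args and copy_group are made up for this illustration and are not part of the regex API; the buffer handling is just one reasonable choice.

```c
#include <regex.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy the text of one capture into `out`,
 * truncating if the buffer is too small. */
static void copy_group(const char *line, const regmatch_t *m,
                       char *out, size_t outsz)
{
    size_t len;

    if (outsz == 0)
        return;
    len = (size_t)(m->rm_eo - m->rm_so);
    if (len >= outsz)
        len = outsz - 1;            /* truncate rather than overflow */
    memcpy(out, line + m->rm_so, len);
    out[len] = '\0';
}

/* Hypothetical helper: returns 0 on a match, nonzero if the pattern
 * fails to compile or the line does not match. */
int extract_open_args(const char *line,
                      char *path, size_t pathsz,
                      char *flags, size_t flagssz)
{
    /* Escaped \( and \) match literal parentheses; the unescaped
     * ( ... ) pairs are the two capture groups. */
    static const char *pattern = "open\\(\"([^\"]*)\", *([^\\)]*)\\).*";
    regex_t reg;
    regmatch_t pmatch[3];           /* [0] whole match, [1] and [2] captures */
    int rc;

    /* No REG_NOSUB here, otherwise regexec would not fill pmatch. */
    rc = regcomp(&reg, pattern, REG_EXTENDED);
    if (rc != 0)
        return rc;

    rc = regexec(&reg, line, 3, pmatch, 0);
    if (rc == 0) {
        copy_group(line, &pmatch[1], path, pathsz);
        copy_group(line, &pmatch[2], flags, flagssz);
    }

    regfree(&reg);
    return rc;
}
```

Calling extract_open_args(line, path, sizeof path, flags, sizeof flags) with the line from the question and two char arrays should leave the file name in path and the flag list in flags.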

...and the printf will probably segfault on you at the moment; the arguments don't match the format string.
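One way to keep the arguments in line with the format string: regoff_t is a signed integer type but is not guaranteed to be int, while both the precision in %.*s and %d expect an int, so the offsets can be converted explicitly. Separately, when the regex was compiled with REG_NOSUB, regexec ignores pmatch, so the offsets should not be read at all in that case. The sketch below assumes the offsets fit in an int; describe_match is a hypothetical name, and it writes into a buffer with snprintf instead of printing directly.

```c
#include <regex.h>
#include <stdio.h>

/* Hypothetical helper: format one capture into buf.
 * Returns whatever snprintf returns. */
int describe_match(char *buf, size_t bufsz,
                   const char *line, const regmatch_t *m)
{
    /* Assumption: the offsets fit in an int. The casts make the
     * argument types agree with %.*s and %d. */
    int start = (int)m->rm_so;
    int end   = (int)m->rm_eo;

    return snprintf(buf, bufsz,
                    "Matched substring \"%.*s\" is found at position %d to %d.",
                    end - start, line + start, start, end - 1);
}
```

The same casts work in a direct printf call; the buffer version is only used here so the result is easy to inspect.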

Upvotes: 1
