Reputation: 484
I want to match the regex (?<=SEARCH_THIS=").+(?<!"\n) in C with PCRE.
However, the following code doesn't work as expected.
#include <pcreposix.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
int main(void){
    regex_t re;
    regmatch_t matches[2];
    char *regex = "(?<=SEARCH_THIS=\").+(?<!\"\n)";
    char *file = "NO_MATCH=\"0\"\nSOMETHING_ELSE=\"1\"\nSOME_STUFF=\"1\"\nSEARCH_THIS=\"gimme that\"\nNOT_THIS=\"foobar\"\nTHIS_NEITHER=\"test\"\n";
    puts("compiling regex");
    int compErr = regcomp(&re, regex, REG_NOSUB | REG_EXTENDED);
    if(compErr != 0){
        char buffer[128];
        regerror(compErr, &re, buffer, 100);
        printf("regcomp failed: %s\n", buffer);
        return 0;
    }
    puts("executing regex");
    int err = regexec(&re, file, 2, matches, 0);
    if(err == 0){
        puts("no error");
        printf("heres the match: [.%*s]",matches[0].rm_eo-matches[0].rm_so,file+matches[0].rm_so);
    } else {
        puts("some error here!");
        char buffer[128];
        regerror(err, &re, buffer, 100);
        printf("regexec failed: %s\n", buffer);
    }
    return 0;
}
The console output is:
compiling regex
executing regex
some error here!
regexec failed: No match
I verified the functionality of this regex here. Any idea what is going wrong?
EDIT #1
Compiler Version
$ arm-merlin-linux-uclibc-gcc --version
arm-merlin-linux-uclibc-gcc (GCC) 4.2.1
Copyright (C) 2007 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Compile Command
$ arm-merlin-linux-uclibc-gcc -lpcre ./re_test.c -o re_test.o
Upvotes: 3
Views: 453
Reputation: 11716
There are actually a few issues with your code.
First, you use %*s in an attempt to restrict the length of the printed string. However, the integer width before the s conversion is the minimum field width: if the string is shorter than the width, it is padded with spaces, and if it is longer, the whole string is printed anyway. To cap the length, use a precision instead, i.e. %.*s with an int argument (just avoid modifying *file, because file points to a string literal).
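To make the width-versus-precision difference concrete, here is a minimal sketch. The helper names copy_span and show_min_width are hypothetical, invented for this illustration; only snprintf and its %*s / %.*s conversions come from the standard library.

```c
#include <stdio.h>

/* Hypothetical helper: copies subject[start..end) into out using a
 * precision ("%.*s"), which caps the number of characters printed.
 * The precision argument must be an int, hence the cast. */
int copy_span(char *out, size_t outsz, const char *subject,
              long start, long end)
{
    return snprintf(out, outsz, "%.*s", (int)(end - start), subject + start);
}

/* Hypothetical helper for contrast: "%*s" only sets a MINIMUM field
 * width, so a longer string is still written in full. */
int show_min_width(char *out, size_t outsz, const char *s, int width)
{
    return snprintf(out, outsz, "%*s", width, s);
}
```

Called as copy_span(buf, sizeof buf, file, matches[0].rm_so, matches[0].rm_eo), this writes only the matched span into buf and leaves the subject string unmodified.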
Second, you specify the REG_NOSUB option in your regcomp call, but according to the man page, this means that no substring positions are stored in the pmatch argument. Thus, even if your regexec call did succeed, the following printf would read uninitialized values (which is undefined behavior).
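As a rough sketch of the REG_NOSUB point, the function below compiles without that flag so that regexec fills in the match offsets. Note the assumptions: it uses the C library's POSIX <regex.h> rather than <pcreposix.h>, and because POSIX extended syntax has no lookbehind, it uses a capture group instead of the original pattern. The function name extract_search_this is hypothetical.

```c
#include <regex.h>
#include <stdio.h>

/* Hypothetical helper: extracts the value of SEARCH_THIS="..." into out.
 * Returns 0 on a match, 1 on no match, -1 if the pattern fails to compile.
 * Assumption: POSIX <regex.h>, with a capture group in place of lookbehind. */
int extract_search_this(const char *subject, char *out, size_t outsz)
{
    regex_t re;
    regmatch_t m[2];   /* m[0]: whole match, m[1]: first capture group */

    /* No REG_NOSUB here, so regexec will store offsets in m[]. */
    if (regcomp(&re, "SEARCH_THIS=\"([^\"]*)\"", REG_EXTENDED) != 0)
        return -1;

    int rc = regexec(&re, subject, 2, m, 0);
    if (rc == 0) {
        int len = (int)(m[1].rm_eo - m[1].rm_so);
        snprintf(out, outsz, "%.*s", len, subject + m[1].rm_so);
    }
    regfree(&re);      /* release the compiled pattern in every case */
    return rc == 0 ? 0 : 1;
}
```

With REG_NOSUB added to the regcomp flags, the m[] entries would not be filled in and must not be read.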
Finally, I suspect the problem is that the \" and \n characters need to be doubly escaped, i.e. you need to use \\\" and \\n in your regex string. While the code you gave worked for me (Ubuntu 14.04 x64), the doubly-escaped version also works.
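To see what the two spellings actually hand to the regex library, the sketch below stores both literals and counts characters in them. The names single_escaped, double_escaped, and count_char are made up for this illustration. With single escaping, the C compiler turns \" into a plain quote and \n into a real newline byte. With double escaping, the backslashes survive into the pattern, where PCRE reads \" as a literal quote and \n as a newline. That would be consistent with both versions matching.

```c
#include <stddef.h>

/* The pattern as written in the question: the compiler resolves the
 * escapes, so the library sees a bare quote and a real newline byte. */
const char single_escaped[] = "(?<=SEARCH_THIS=\").+(?<!\"\n)";

/* The doubly-escaped variant: the library sees backslash-quote and
 * backslash-n and interprets those escapes itself. */
const char double_escaped[] = "(?<=SEARCH_THIS=\\\").+(?<!\\\"\\n)";

/* Hypothetical helper: counts occurrences of c in s. */
size_t count_char(const char *s, char c)
{
    size_t n = 0;
    for (; *s != '\0'; ++s)
        if (*s == c)
            ++n;
    return n;
}
```

Here count_char(single_escaped, '\n') is 1 and count_char(single_escaped, '\\') is 0, while for double_escaped the counts are 0 newlines and 3 backslashes.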
Taking all of this into account, this is the output I get:
compiling regex
executing regex
no error
heres the match: [.gimme that"]
Upvotes: 1