Mark

Reputation: 23

Perl regex multiline zero or more occurrences

I've got the following text:

#ifdef blah
a
#else 
b
#endif

#ifdef blah
c
#endif

I'd like to create a Perl regex that deletes each #ifdef blah / #endif pair and everything wrapped by it, keeping only whatever is under the #else, if one exists. After the substitution, the text above should be:

b

I've tried something like this: perl -i.bak -pe 'BEGIN{undef $/;} s/^#ifdef blah.*(^#else blah(.*))?#endif blah/\2/smg' test.c

However, there appears to be a problem marking the #else as occurring zero or more times, and nothing gets selected.

Upvotes: 2

Views: 450

Answers (2)

geekosaur

Reputation: 61369

The regex you gave matches a single character after #ifdef blah (which will probably be a newline) and immediately expects to see ^#else. Also, from the looks of it, you're using "blah" as a wildcard for "anything"?

s/^
   \# \s* ifdef \s+ \w+ \s* \n          # start of ifdef
     .*? \n                             # ignore true case
   (?:
     \# \s* else \s* \n                 # there's an else leg
       (.*? \n)                         # ...so capture it
   )?                                   # or there isn't
   \# \s* endif \s* \n                  # end it
 /defined $1 ? $1 : ''/smgex;           # if we got an else, keep its body; otherwise delete the whole block

Note that a regex is not going to handle nested #ifdefs properly (this is a simpler version of why you shouldn't try to parse HTML with a regex). You can force it to work for this simple case with some evil, but it's still getting rather too close to the Old Ones for comfort. Best for that case is to use a real parser.

Or you can eschew reinventing the wheel and use unifdef.
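
A minimal sketch of the unifdef route, assuming the tool is installed (the scratch file name and variable names are made up). -Ublah tells unifdef to treat blah as undefined, which drops the #ifdef branch and keeps the #else branch. Note that unifdef exits with status 1 when its output differs from its input, so that status is not an error.

```shell
#!/bin/sh
# Sketch only: requires the unifdef tool; the scratch file names are made up.
if command -v unifdef >/dev/null 2>&1; then
  work=$(mktemp -d)
  printf '%s\n' '#ifdef blah' 'a' '#else' 'b' '#endif' > "$work/sample.c"
  # -Ublah: treat blah as undefined, so only the #else branch is kept.
  # Exit status 1 means "output differs from input", which is expected here.
  unifdef_out=$(unifdef -Ublah "$work/sample.c") || [ $? -eq 1 ]
  printf '%s\n' "$unifdef_out"
  rm -rf "$work"
else
  echo "unifdef is not installed" >&2
fi
```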

Upvotes: 1

Mario

Reputation: 36487

Didn't try it live, but this pattern should do the trick:

$whatever =~ s/#ifdef.*?(?:#else\n(.*?))?#endif/$1/si

Note that this won't check for any #elif (you could include it similar to the #else part).

Upvotes: 0
