Timo Haberkern

Reputation: 4439

Getting variables from string with regular expressions

I'm going nuts with a regular expression. I have a string like this:

%closed% closed (%closed_percent%\%), %open% open (%open_percent%\%)

What I need is a regular expression that matches the following:

%closed%
%closed_percent%
%open%
%open_percent%

but not the two \%

At the moment I use:

\%([^\%]+)\%

that gives me:

%closed%
%closed_percent%
%), %
% open (%
...

Can anyone help me with that?

Upvotes: 2

Views: 453

Answers (5)

Qtax

Reputation: 33908

The simple way:

%\w+%

Matches: %foo%

Allows (multiple) backslash escapes:

(?<!\\)(?:\\.)*%(\w+)%

Matches only bar in: \%foo% \\%bar% \\\%baz%

...and this allows escapes inside of it too:

(?<!\\)(?:\\.)*%((?:[^\\%\s]+|\\.)+)%

Matches: %foo\%bar%

Use the value of the first capturing group with the last two expressions.
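As a minimal sketch of how the first capturing group could be read, assuming PHP (the variable names `$input`, `$escaped`, `$names` and `$inner` are made up for illustration), the last two expressions can be run against the sample strings above:

```php
<?php
// Sample strings taken from the "Matches" lines above.
// Nowdoc syntax keeps every backslash literal.
$input = <<<'TXT'
\%foo% \\%bar% \\\%baz%
TXT;
$escaped = <<<'TXT'
%foo\%bar%
TXT;

// In a single-quoted PHP string, \\\\ becomes \\ in the regex,
// which matches one literal backslash.
$skipEscapes  = '/(?<!\\\\)(?:\\\\.)*%(\w+)%/';
$allowEscapes = '/(?<!\\\\)(?:\\\\.)*%((?:[^\\\\%\s]+|\\\\.)+)%/';

preg_match_all($skipEscapes, $input, $m);
$names = $m[1];   // group 1 holds the token name without the % signs

preg_match_all($allowEscapes, $escaped, $m);
$inner = $m[1];   // group 1 may still contain escape sequences such as \%

print_r($names);
print_r($inner);
```

With these inputs, `$names` should hold only `bar`, and `$inner` should hold `foo\%bar` with the escape still in place, so any unescaping would be a separate step.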

Upvotes: 2

Kobi

Reputation: 138007

Assuming no restrictions on what can be in the percent wrapped tokens (including escaped characters), and what characters can be escaped (so backslashes can also be escaped: \\%token% should be valid),
here's a pattern you can use to skip over escaped characters:

\\.|(%([^%\\]|\\.)+%)

This will match the percent-wrapped tokens and capture them in the first group ($1). Escaped characters will also be matched (it's a nice trick to skip over them), but in PHP it is very easy to keep just the relevant tokens:

preg_match_all('/\\\\.|(%([^%\\\\]|\\\\.)+%)/', $str, $matches, PREG_PATTERN_ORDER);
$matches = array_filter($matches[1]);

Working example: http://ideone.com/dziCB

Upvotes: 1

Hubro

Reputation: 59323

Add negative lookbehinds for the backslashes! That way \% is ignored, as intended.

(?<!\\)\%([^\%]+)(?<!\\)\%

Matches

%closed%

%closed_percent%

%open%

%open_percent%
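A rough way to check this, assuming PHP as in the other answers (the variable names `$str`, `$matches`, `$tokens` and `$names` are only illustrative), is to run the pattern against the string from the question:

```php
<?php
// The string from the question; nowdoc keeps the \% sequences literal.
$str = <<<'TXT'
%closed% closed (%closed_percent%\%), %open% open (%open_percent%\%)
TXT;

// \\\\ in a single-quoted PHP string is \\ in the regex: one literal backslash.
// Both lookbehinds reject a % that directly follows a backslash.
$pattern = '/(?<!\\\\)\%([^\%]+)(?<!\\\\)\%/';

preg_match_all($pattern, $str, $matches);
$tokens = $matches[0];   // full matches, including the % signs
$names  = $matches[1];   // group 1: the names between the % signs

print_r($tokens);
```

Note that this only looks at a single preceding backslash, so an escaped backslash directly in front of a token (`\\%token%`) would be skipped as well.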

Upvotes: 0

Lazarus

Reputation: 43064

Try this:

\%([^(\\\%)]+?)\%

matches

%closed%
%closed_percent%
%open%
%open_percent%

for me.
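A small sketch, assuming PHP (the names `$str`, `$matches`, `$found` and `$hits` are illustrative), to show what the character class does. Inside `[...]` the parentheses are literal characters, so the class excludes `(`, `)`, `\` and `%`. The second input, `%a(b)%`, is a made-up token used only to show that effect:

```php
<?php
// The string from the question; nowdoc keeps the \% sequences literal.
$str = <<<'TXT'
%closed% closed (%closed_percent%\%), %open% open (%open_percent%\%)
TXT;

// Regex after PHP string unescaping: \%([^(\\\%)]+?)\%
$pattern = '/\%([^(\\\\\%)]+?)\%/';

preg_match_all($pattern, $str, $matches);
$found = $matches[0];    // full matches, including the % signs

// Hypothetical token containing parentheses: the class forbids ( and ),
// so nothing is matched here.
$hits = preg_match_all($pattern, '%a(b)%', $ignored);

print_r($found);
var_dump($hits);
```

On the question's string the escaped `\%` is skipped because the character after it is `)`, which the class rejects. A token that itself contains parentheses would therefore not be matched either.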

Upvotes: 1

smottt

Reputation: 3330

Try:

~\%\w+\%~

So, allow only a-z, A-Z, 0-9 and _ in your selection.

$str = "%closed% closed (%closed_percent%\%), %open% open (%open_percent%\%)";

preg_match_all("~\%\w+\%~", $str, $matches);

$matches now contains:

Array
(
    [0] => Array
    (
        [0] => %closed%
        [1] => %closed_percent%
        [2] => %open%
        [3] => %open_percent%
    )
)

Upvotes: 0
