Reputation: 57
I want to create a regex that saves all of the $text1 and $text2 values in two separate arrays. $text1 and $text2 appear in the string in the form ($text1)[$text2].
I wrote this code to parse the text between brackets:
<?php
preg_match_all("/\[[^\]]*\]/", $text, $matches);
?>
It works correctly.
And I wrote another snippet to parse the text between parentheses:
<?php
preg_match('/\([^\)]*\)/', $text, $match);
?>
But it only matches the text between one pair of parentheses, not all of the parentheses in the string :(
So I have two problems:
1) How can I parse the text between all of the parentheses in the string?
2) How can I get $text1 and $text2 as I described at the top?
Please help me. I am confused about regex. If you have a good resource, please share its link. Thanks ;)
Upvotes: 1
Views: 1353
Reputation: 47894
The only reason you were failing to capture multiple ( ) wrapped substrings is that you were calling preg_match() instead of preg_match_all().
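A minimal sketch of that difference, using a made-up sample string (the contents of $text are an assumption for illustration, not taken from the question):

```php
<?php
// Hypothetical sample input, not from the original question.
$text = 'first (one) then (two) and finally (three)';

// preg_match() stops after the first match.
preg_match('/\([^)]*\)/', $text, $match);

// preg_match_all() keeps scanning and collects every match.
preg_match_all('/\([^)]*\)/', $text, $matches);

var_export($match);      // holds only '(one)'
var_export($matches[0]); // holds '(one)', '(two)', '(three)'
```

With preg_match_all() the full-string matches land in $matches[0]; any capture groups would follow in $matches[1], $matches[2], and so on.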
A couple of small points:
- The ) inside of your negated character class didn't need to be escaped.
- There is no need for the i pattern modifier; you have no letters in your pattern to modify.
Combine your two patterns into one, bake in my small points, and you have a fully refined/optimized pattern.
In case you don't know why your patterns are great, I'll explain. When you ask the regex engine to match "greedily", it can move more efficiently (take fewer steps).
By using a negated character class, you can employ greedy matching. If you only use . then you have to use "lazy" matching (*?) to ensure that matching doesn't "go too far".
Pattern: ~\(([^)]*)\)\[([^\]]*)]~ (11 steps)
The above will capture zero or more characters between the parentheses as Capture Group #1, and zero or more characters between the square brackets as Capture Group #2.
If you KNOW that your target strings will obey your strict format, you can even remove the final ] from the pattern to improve efficiency. (10 steps)
Compare this with lazy . matching: ~\((.*?)\)\[(.*?)]~ (35 steps), and that's only on your little 16-character input string. As your text increases in length (I can only imagine that you are targeting these substrings inside a much larger block of text), the performance impact will become greater.
My point is, always try to design patterns that use "greedy" quantifiers in pursuit of making the best / most efficient pattern. (Further tips on improving efficiency: avoid piping (|), avoid capture groups, and avoid lookarounds whenever reasonable, because they cost steps.)
Code:
$string = 'Demo #1: (11 steps)[1] and Demo #2: (35 steps)[2]';
// array_slice() drops $out[0] (the full-string matches) and keeps only the capture groups
var_export(preg_match_all('~\(([^)]*)\)\[([^\]]*)]~', $string, $out) ? array_slice($out, 1) : []);
Output: (I trimmed off the full-string matches with array_slice())
array (
  0 =>
  array (
    0 => '11 steps',
    1 => '35 steps',
  ),
  1 =>
  array (
    0 => '1',
    1 => '2',
  ),
)
Or, depending on your use, with PREG_SET_ORDER:
Code:
$string = 'Demo #1: (11 steps)[1] and Demo #2: (35 steps)[2]';
// PREG_SET_ORDER groups the results per match instead of per capture group
var_export(preg_match_all('~\(([^)]*)\)\[([^\]]*)]~', $string, $out, PREG_SET_ORDER) ? $out : []);
Output:
array (
  0 =>
  array (
    0 => '(11 steps)[1]',
    1 => '11 steps',
    2 => '1',
  ),
  1 =>
  array (
    0 => '(35 steps)[2]',
    1 => '35 steps',
    2 => '2',
  ),
)
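The per-match layout is convenient when each pair should be handled together. A rough sketch, assuming the same demo string; the $pairs array and the "text1 => text2" label format are made up for illustration:

```php
<?php
$string = 'Demo #1: (11 steps)[1] and Demo #2: (35 steps)[2]';

$pairs = [];
if (preg_match_all('~\(([^)]*)\)\[([^\]]*)]~', $string, $out, PREG_SET_ORDER)) {
    foreach ($out as $set) {
        // $set[0] is the full match, $set[1] the parenthesized text,
        // $set[2] the bracketed text.
        $pairs[] = $set[1] . ' => ' . $set[2];
    }
}

var_export($pairs);
```

Without PREG_SET_ORDER the same loop would need to walk $out[1] and $out[2] in parallel by index.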
Upvotes: 1
Reputation: 9937
Use preg_match_all() with the following regular expression:
/(\[.+?\])(\(.+?\))/i
Details
/      # begin pattern
(      # first group, brackets
\[     # literal opening bracket
.+?    # any character (except newline), one or more times, lazily
\]     # literal closing bracket
)      # first group, close
(      # second group, parentheses
\(     # literal opening parenthesis
.+?    # any character (except newline), one or more times, lazily
\)     # literal closing parenthesis
)      # second group, close
/i     # end pattern, followed by the case-insensitive modifier
This saves each bracketed substring (brackets included) in one array, and each parenthesized substring (parentheses included) in another. So, in PHP:
<?php
$s = "[test1](test2) testing the regex [test3](test4)";
preg_match_all("/(\[.+?\])(\(.+?\))/i", $s, $m);
var_dump($m[1]); // bracket group
var_dump($m[2]); // parentheses group
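Because the groups in this pattern wrap the delimiters, the captured values include them. If only the inner text is wanted, one possible variation (a sketch, not part of the pattern above) moves the groups inside the delimiters; the names $outer and $inner are illustrative:

```php
<?php
$s = "[test1](test2) testing the regex [test3](test4)";

// Groups wrap the delimiters: captures include "[", "]", "(", ")".
preg_match_all('/(\[.+?\])(\(.+?\))/', $s, $outer);

// Groups sit inside the delimiters: captures hold only the contents.
preg_match_all('/\[(.+?)\]\((.+?)\)/', $s, $inner);

var_export($outer[1]); // '[test1]', '[test3]'
var_export($inner[1]); // 'test1', 'test3'
var_export($inner[2]); // 'test2', 'test4'
```

Note that both variants expect the bracketed part first, so they match [a](b) rather than the (a)[b] order described in the question.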
Upvotes: 2