Reputation: 1693
I would like to run a regex against each of these lines:
127.0.0.1 localhost
# 127.0.0.1 fake
1.2.3.4 foo bar baz
The goal is to ignore a line when it starts with a #; otherwise I want to capture the IP and each string after it.
Here is my attempt:
{^\s?(?<ip>[^#\s]+)(?:\s+(?<domain>[^\s]+))*$}
My problem is that when I run this on 1.2.3.4 foo bar baz it only captures baz, not foo and bar. I would like every domain.
PS: I'm using PHP. You can try it here: https://regex101.com/r/S8Fzlu/1
Upvotes: 2
Views: 67
Reputation: 786359
PHP's regex engine (PCRE) doesn't create capture groups dynamically when a group is repeated by a quantifier. Each iteration overwrites the previous capture, so only the last captured string is returned. That's why you're seeing only baz in the domain capture group.
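To make the behaviour concrete, here is a minimal sketch that runs the pattern from the question with preg_match. Nothing here is new beyond the variable names; it only shows that the repeated group keeps its last iteration.

```php
<?php
// Pattern copied from the question, including its { } delimiters.
$re = '{^\s?(?<ip>[^#\s]+)(?:\s+(?<domain>[^\s]+))*$}';

preg_match($re, '1.2.3.4 foo bar baz', $m);

// The quantified group ran three times, but only the last value survives.
echo $m['ip'], "\n";     // 1.2.3.4
echo $m['domain'], "\n"; // baz
```

The whole line still matches; it is only the intermediate captures (foo and bar) that are discarded.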
However, you can use the \G anchor (it is not a word boundary) and capture all the strings with preg_match_all and this regex:
(?:^\h*(?<ip>(?:\d+\.){3}\d+)|(?!^)\G)\h+(?<domain>\S+)
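\G asserts the position at the end of the previous match, or at the start of the string for the first match. The (?!^) lookahead rules out the start-of-string case, so the second alternative can only continue from a previous match.
Code: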
$str = '1.2.3.4 foo bar baz';
$re = '/(?:^\h*(?<ip>(?:\d+\.){3}\d+)|(?!^)\G)\h+(?<domain>\S+)/';
preg_match_all($re, $str, $m);
print_r($m['ip']);
print_r($m['domain']);
Output:
Array
(
    [0] => 1.2.3.4
    [1] => 
    [2] => 
)
Array
(
    [0] => foo
    [1] => bar
    [2] => baz
)
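As a rough sketch of how this could be applied to all three lines from the question, the regex can be run line by line and the domains grouped under their IP. The function name parseHostsLines and the shape of the returned array are assumptions made for illustration; only the regex comes from this answer.

```php
<?php
// Hypothetical helper: maps each IP to the list of names that follow it.
function parseHostsLines(array $lines): array
{
    $re = '/(?:^\h*(?<ip>(?:\d+\.){3}\d+)|(?!^)\G)\h+(?<domain>\S+)/';
    $hosts = [];
    foreach ($lines as $line) {
        // Commented lines never match: the ip branch needs digits right after
        // optional leading blanks, and the \G branch is barred from offset 0.
        if (preg_match_all($re, $line, $m) > 0) {
            // Only the first match carries the ip; later matches leave it empty.
            $hosts[$m['ip'][0]] = $m['domain'];
        }
    }
    return $hosts;
}

$hosts = parseHostsLines([
    '127.0.0.1 localhost',
    '# 127.0.0.1 fake',
    '1.2.3.4 foo bar baz',
]);
print_r($hosts);
```

With the sample input this should give localhost under 127.0.0.1 and foo, bar, baz under 1.2.3.4, with the commented line skipped. Note that a line containing an IP but no name produces no match at all, because the pattern requires at least one domain.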
Upvotes: 1
Reputation: 10940
I'm not sure how PHP regexes work, but this regex works in JavaScript and C#, so give it a try:
^\s?(?<ip>[^#\s]+)(?:\s+(?<domain>[^.]+)*)$
Note that I have moved the '*' inside the closing parenthesis of the non-capturing group.
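For what it's worth, here is a quick check of that pattern in PHP. As far as I can tell, [^.]+ also consumes spaces, so the domain group ends up holding the rest of the line as one string; the preg_split step afterwards is my own addition, not part of this answer.

```php
<?php
// Pattern copied from this answer, wrapped in / delimiters.
$re = '/^\s?(?<ip>[^#\s]+)(?:\s+(?<domain>[^.]+)*)$/';

preg_match($re, '1.2.3.4 foo bar baz', $m);
echo $m['domain'], "\n"; // foo bar baz (a single string)

// Assumed extra step: split the captured remainder on whitespace.
$domains = preg_split('/\s+/', trim($m['domain']));
print_r($domains);

// The commented line is rejected, as the question requires.
$commented = preg_match($re, '# 127.0.0.1 fake');
```

Because [^.] excludes dots, a name such as example.com would make the whole line fail to match, so this only suits dot-free names like the ones in the question.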
Upvotes: 0