Dave

Reputation: 1001

PHP preg_match_all capture all patterns at front of string not mid string

Given the subject

AB: CD:DEF: HIJ99:message packet - no capture

I have crafted the following regex to correctly capture the 2-5 character targets, each of which is followed by a colon:

/\s{0,1}([0-9a-zA-Z]{2,5}):\s{0,1}/

which returns my matches even if erroneous spaces are added before or after the targets:

[0] => AB
[1] => CD
[2] => DEF
[3] => HIJ99

However, if the message packet contains a colon in it anywhere, for example

AB: CD:DEF: HIJ99:message packet no capture **or: this either**

it of course includes [4] => or in the resulting set, which is not desired. I want to limit the matches to a consecutive run from the beginning of the string and, once that run is broken, stop looking for matches in the remainder.

Edit 1: I also tried ^(\s{0,1}([0-9a-zA-Z]{2,5}):\s{0,1}){1,5} to force matching from the beginning of the string for multiple targets, but then I lose the individual matches, because a repeated capture group keeps only its last iteration:

[0] => Array
    (
        [0] => AB: CD:DEF: HIJ99:
    )

[1] => Array
    (
        [0] => HIJ99:
    )

[2] => Array
    (
        [0] => HIJ99
    )

Edit 2: Keep in mind the subject is not fixed.

AB: CD:DEF: HIJ99:message packet - no capture

could just as easily be

ZY:xw:VU:message packet no capture or: this either

The targets we are trying to pull vary, and so does the message packet. I am just trying to rule out matching a ":" inside the message packet.

Upvotes: 2

Views: 71

Answers (2)

Avinash Raj

Reputation: 174756

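You could use \G to match consecutively from the start of the string. \G asserts that each match begins exactly where the previous one ended (or at the start of the subject for the first match), so the scan stops as soon as a target does not follow directly.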

$str = 'AB: CD:DEF: HIJ99:message packet no capture or: this either';
preg_match_all('/\G\s*([0-9a-zA-Z]{2,5}):\s*/', $str, $m);
print_r($m[1]);

Output:

Array
(
    [0] => AB
    [1] => CD
    [2] => DEF
    [3] => HIJ99
)
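
As a rough check against the variable subject from Edit 2, the same pattern can be wrapped in a small helper. This is only a sketch: the function name leadingTags is hypothetical and not part of the original answer, and it assumes the message packet does not itself begin with a 2-5 character token followed directly by a colon.

```php
<?php
// Hypothetical helper (name invented for illustration): returns the
// consecutive run of "tag:" targets at the start of the subject.
function leadingTags(string $subject): array
{
    // \G pins every attempt to the end of the previous match (offset 0 for
    // the first one), so the first position that is not a "tag:" ends the scan.
    preg_match_all('/\G\s*([0-9a-zA-Z]{2,5}):\s*/', $subject, $m);

    // $m[1] holds the captured targets without the colon or surrounding spaces.
    return $m[1];
}

print_r(leadingTags('ZY:xw:VU:message packet no capture or: this either'));
```

With that subject, the later "or:" is never reached, because the run is already broken at "message".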

Upvotes: 1

Toto

Reputation: 91488

How about:

$str = 'AB: CD:DEF: HIJ99:message packet no capture or: this either';
preg_match_all('/(?<![^:]{7})([0-9a-zA-Z]{2,5}):/', $str, $m);
print_r($m);

Output:

Array
(
    [0] => Array
        (
            [0] => AB:
            [1] => CD:
            [2] => DEF:
            [3] => HIJ99:
        )

    [1] => Array
        (
            [0] => AB
            [1] => CD
            [2] => DEF
            [3] => HIJ99
        )

)
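
To see what the lookbehind is doing: it rejects any candidate that is preceded by 7 consecutive non-colon characters, so it relies on stray colons in the message sitting well away from the last real target. The sketch below uses a hypothetical helper name, tagsByLookbehind, and a made-up short subject to show where that assumption stops holding.

```php
<?php
// Hypothetical helper (name invented for illustration): applies the
// lookbehind pattern and returns only the captured targets.
function tagsByLookbehind(string $subject): array
{
    // (?<![^:]{7}) fails only when the 7 characters before the candidate
    // exist and none of them is a colon.
    preg_match_all('/(?<![^:]{7})([0-9a-zA-Z]{2,5}):/', $subject, $m);
    return $m[1];
}

// The "or:" is far from the last real target, so it is rejected.
print_r(tagsByLookbehind('ZY:xw:VU:message packet no capture or: this either'));

// Made-up subject: "yo:" has fewer than 7 characters before it,
// so the lookbehind cannot reject it and it is captured as well.
print_r(tagsByLookbehind('AB:hi yo: x'));
```

The second call returns both AB and yo, so this pattern fits best when the message packet is known not to contain a short token and colon close to the header.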

Upvotes: 1
