Vigintas Labakojis

Reputation: 1069

PHP preg_match_all with a simple regex returns empty values

I need to extract a predefined set of hashtags from a blob of text, and then extract the number that immediately follows each one, if any. E.g., I'd need to extract 30 from "Test string with #other30 hashtag". I assumed preg_match_all would be the right choice.

Some test code:

$hashtag = '#other';
$string  = 'Test string with #other30 hashtag';
$matches = [];
preg_match_all('/' . $hashtag . '\d*/', $string, $matches);
print_r($matches);

Output:

Array
(
    [0] => Array
        (
            [0] => #other30
        )
)

Perfect... Works as expected. Now to extract the number:

$string = $matches[0][0]; // #other30
$matches = [];
preg_match_all('/\d*/', $string, $matches);
print_r($matches);

Output:

Array
(
    [0] => Array
        (
            [0] =>
            [1] =>
            [2] =>
            [3] =>
            [4] =>
            [5] =>
            [6] => 30
            [7] =>
        )
)

What? Looks like it's trying to match every character?

I'm aware of some preg_match_all related answers (one, two), but they all use a parenthesized subpattern. According to the documentation, it is optional.

What am I missing? How do I simply get all matches of such a basic regex as /\d*/ into an array? There doesn't seem to be a more appropriate function in PHP for that.

I never thought I'd be scratching my head with such a basic thing in PHP. Much appreciated.

Upvotes: 3

Views: 2054

Answers (4)

bobble bubble

Reputation: 18490

Note that you can use \K to reset the start of the reported match, so that only the part after it is returned. You also need \d+ instead of \d*, so that at least one digit is required. Otherwise \d* also matches the empty string at every position between characters where no digit follows.

So your code can be reduced to

$hashtag = '#other';
$string  = 'Test string with #other30 #other31 hashtag';
preg_match_all('/' . $hashtag . '\K\d+/', $string, $matches);
print_r($matches[0]);

See the demo at eval.in and consider using preg_quote for $hashtag.
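
To make the preg_quote suggestion concrete, here is a minimal sketch of the same \K pattern with the hashtag escaped first. The $pattern and $numbers variable names are only illustrative; the second argument to preg_quote is the delimiter, so a '/' inside a tag would be escaped as well.

```php
<?php
$hashtag = '#other';
$string  = 'Test string with #other30 #other31 hashtag';

// Escape regex metacharacters in the tag, including the '/' delimiter.
$pattern = '/' . preg_quote($hashtag, '/') . '\K\d+/';

// \K drops the hashtag from the reported match, so only the digits remain.
preg_match_all($pattern, $string, $matches);
$numbers = $matches[0];

print_r($numbers); // contains '30' and '31'
```

This only matters when a tag can contain characters such as '.', '+' or '/', but it is cheap insurance if the tag list comes from user input.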

Upvotes: 1

Mi-Creativity

Reputation: 9654

PHP Fiddle

<?php

    $hashtag = '#other';
    $string  = 'Test string with #other30 hashtag';
    $matches = [];
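    // First pass: find the hashtag together with any digits that follow it.
    preg_match_all('/' . $hashtag . '\d*/', $string, $matches);
    // Second pass: \d+ requires at least one digit, so no empty matches are returned.
    preg_match_all('#\d+#', $matches[0][0], $m);
    echo $m[0][0];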

?>

Upvotes: 0

Muhammad Bilal

Reputation: 2134

You need to replace:

preg_match_all('/\d*/', $string, $matches);

with:

preg_match_all('/\d+/', $string, $matches);

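Replace * with +, because:

* matches zero or more times, so \d* also succeeds with an empty match at every position that has no digit.

+ matches one or more times, so \d+ only matches where at least one digit is present.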
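
To see where the empty entries come from, here is a small sketch that prints the offset of every match. It assumes PHP 7.1+ for the array destructuring in foreach; the $star and $plus names are just for illustration.

```php
<?php
$string = '#other30';

// With *, zero digits is a valid match, so the pattern succeeds at every offset.
preg_match_all('/\d*/', $string, $star, PREG_OFFSET_CAPTURE);
foreach ($star[0] as [$text, $offset]) {
    printf("offset %d: '%s'\n", $offset, $text);
}

// With +, at least one digit is required, so only the actual number is matched.
preg_match_all('/\d+/', $string, $plus);
print_r($plus[0]);
```

The first loop reports an empty match at each of the six non-digit characters, then '30' at offset 6, then one more empty match at the end of the string, which lines up with the eight entries in the question's output.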

Upvotes: 2

anubhava

Reputation: 785058

You can use a capturing group:

preg_match_all('/' . $hashtag . '(\d*)/', $string, $matches); 
echo $matches[1][0] . "\n";
//=> 30

Here (\d*) will capture the number after $hashtag.
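
Since the question mentions a predefined set of hashtags, the capturing-group idea could be extended along these lines. This is only a sketch: the $hashtags list, the sample text and the $found array are made up for illustration, and a hashtag with no number is recorded as null here.

```php
<?php
// Hypothetical tag list and input text, for illustration only.
$hashtags = ['#other', '#misc'];
$string   = 'Test #other30 and #misc7 plus a bare #other tag';

// Escape each tag, then join them into one alternation group.
$escaped = array_map(function ($tag) {
    return preg_quote($tag, '/');
}, $hashtags);
$pattern = '/(' . implode('|', $escaped) . ')(\d*)/';

// PREG_SET_ORDER groups the results per match instead of per subpattern.
preg_match_all($pattern, $string, $matches, PREG_SET_ORDER);

$found = [];
foreach ($matches as $m) {
    // $m[1] is the tag, $m[2] the digits (an empty string when none follow).
    $found[] = [$m[1], $m[2] === '' ? null : (int) $m[2]];
}

print_r($found);
```

Here \d* is fine because it is anchored to a hashtag, so it can only match empty directly after a tag. If one tag is a prefix of another (say #other and #others), the longer one would probably need to come first in the list.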

Upvotes: 1
