Reputation: 1069
I need to find a predefined set of hashtags in a blob of text, then extract the number that follows each one, if any. For example, I'd need to extract 30 from "Test string with #other30 hashtag". I assumed preg_match_all would be the right choice.
Some test code:
$hashtag = '#other';
$string = 'Test string with #other30 hashtag';
$matches = [];
preg_match_all('/' . $hashtag . '\d*/', $string, $matches);
print_r($matches);
Output:
Array
(
[0] => Array
(
[0] => #other30
)
)
Perfect... Works as expected. Now to extract the number:
$string = $matches[0][0]; // #other30
$matches = [];
preg_match_all('/\d*/', $string, $matches);
print_r($matches);
Output:
Array
(
[0] => Array
(
[0] =>
[1] =>
[2] =>
[3] =>
[4] =>
[5] =>
[6] => 30
[7] =>
)
)
What? It looks like it's trying to match at every character position?
I'm aware of some preg_match_all related answers (one, two), but they all use a parenthesized subpattern. According to the documentation, that is optional.
What am I missing? How do I simply get all matches for a regex as basic as /\d*/ into an array? There doesn't seem to be a more appropriate function in PHP for that.
I never thought I'd be scratching my head over such a basic thing in PHP. Any help is much appreciated.
Upvotes: 3
Views: 2054
Reputation: 18490
Also note that you can reset the start of the reported match with \K, so that only the part after it is returned. And of course you need to use \d+ instead of \d* to match one or more digits. Otherwise you get empty matches in the gaps between the characters, because zero digits also satisfies \d*.
So your code can be reduced to
$hashtag = '#other';
$string = 'Test string with #other30 #other31 hashtag';
preg_match_all('/' . $hashtag . '\K\d+/', $string, $matches);
print_r($matches[0]);
See the demo at eval.in and consider using preg_quote for $hashtag.
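To illustrate why preg_quote matters here, a minimal sketch follows. The hashtag `#c++` and the sample text are made up for the example; they are not from the question. Without escaping, the `+` characters would be read as quantifiers and the pattern would not match.

```php
<?php
// Hypothetical hashtag containing regex metacharacters, for illustration only.
$hashtag = '#c++';
$string  = 'Learning #c++11 and #c++14 today';

// preg_quote escapes the metacharacters; passing '/' as the second
// argument also escapes the delimiter if it appears in the hashtag.
$pattern = '/' . preg_quote($hashtag, '/') . '\K\d+/';

preg_match_all($pattern, $string, $matches);
print_r($matches[0]); // Array ( [0] => 11 [1] => 14 )
```

The \K still drops the hashtag from the reported match, so $matches[0] holds only the numbers.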
Upvotes: 1
Reputation: 9654
<?php
$hashtag = '#other';
$string = 'Test string with #other30 hashtag';
$matches = [];
preg_match_all('/' . $hashtag . '\d*/', $string, $matches);
preg_match_all('#\d+#', $matches[0][0], $m); // pull the digits out of "#other30"
echo $m[0][0];
?>
Upvotes: 0
Reputation: 2134
You need to replace:
preg_match_all('/\d*/', $string, $matches);
with:
preg_match_all('/\d+/', $string, $matches);
Replace * with +, because:
* matches zero or more times, so it also matches the empty string.
+ matches one or more times.
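The difference can be seen side by side in a small sketch, using the same "#other30" input as the question:

```php
<?php
$string = '#other30';

// \d* may match the empty string, so it "succeeds" at every position.
preg_match_all('/\d*/', $string, $star);

// \d+ requires at least one digit, so only the real number matches.
preg_match_all('/\d+/', $string, $plus);

echo count($star[0]), "\n"; // 8: seven empty strings plus "30"
print_r($plus[0]);          // Array ( [0] => 30 )
```

The eight results from \d* correspond to the eight entries in the output shown in the question.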
Upvotes: 2
Reputation: 785058
You can use a capturing group:
preg_match_all('/' . $hashtag . '(\d*)/', $string, $matches);
echo $matches[1][0] . "\n";
//=> 30
Here (\d*) will capture the number after $hashtag.
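Since the question mentions a predefined set of hashtags, the capturing-group idea could be extended roughly as below. This is only a sketch: the `$hashtags` list and the sample text are assumptions for illustration, and using PREG_SET_ORDER is one possible choice, not the only one.

```php
<?php
// Hypothetical set of hashtags and sample text, for illustration only.
$hashtags = ['#other', '#promo'];
$string   = 'Test string with #other30, #promo and #other7 hashtags';

// Build an alternation of the escaped hashtags.
$alternation = implode('|', array_map(
    function ($tag) { return preg_quote($tag, '/'); },
    $hashtags
));

// PREG_SET_ORDER groups the results per match: [full match, hashtag, number].
preg_match_all('/(' . $alternation . ')(\d*)/', $string, $matches, PREG_SET_ORDER);

foreach ($matches as $match) {
    // (\d*) yields an empty string when no digits follow the hashtag.
    $number = $match[2] === '' ? 'none' : $match[2];
    echo $match[1], ' => ', $number, "\n";
}
```

Keeping \d* inside the group is deliberate here: a hashtag without a number still produces a match, with an empty string in the second group.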
Upvotes: 1