Reputation: 871
I am writing unit tests for some methods I use, and I have found a weird bug that I would like some regex advice on.
When doing:
$needle = ' ';
$haystack = 'hello world. this is a unit test.';
$pattern = '/\b' . $needle . '\b/';
preg_match_all($pattern, $haystack, $matches, PREG_OFFSET_CAPTURE, $offset);
I'm expecting the positions found to be
[5, 12, 17, 20, 22, 27]
This is the same result I get from the following, which finds non-exact (not whole-word) matches:
while (($pos = strpos($haystack, $needle, $offset)) !== false) {
$offset = $pos + 1;
$positions[] = $pos;
}
However, preg_match_all does not find the 2nd occurrence (12), the space in ". this".
Is this to do with the \b word boundary? How can I make sure it picks up this occurrence as well?
Thanks
Upvotes: 1
Views: 47
Reputation: 72289
You have to change your $pattern in preg_match_all() as shown below:
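<?php
$haystack = 'hello world. this is a unit test.';
// Initialise the loop state before the first strpos() call.
$offset = 0;
$positions = [];
while (($pos = strpos($haystack, ' ', $offset)) !== false) {
    $offset = $pos + 1;
    $positions[] = $pos;
}
echo "<pre/>"; print_r($positions);
// \s matches every whitespace character, including the space after the full stop.
preg_match_all('/\s/', $haystack, $matches, PREG_OFFSET_CAPTURE);
echo "<pre/>"; print_r($matches);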
Output:- https://eval.in/725574
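To see why the original pattern skips offset 12, it may help to run both patterns side by side. The sketch below is only an illustration: matchOffsets() is a hypothetical helper name, not something from the question. \b matches only between a word character and a non-word character, and both "." and " " are non-word characters, so no boundary exists in front of that space.

```php
<?php
$haystack = 'hello world. this is a unit test.';

// Hypothetical helper: collect the byte offset of every match of $pattern.
function matchOffsets(string $pattern, string $haystack): array
{
    preg_match_all($pattern, $haystack, $matches, PREG_OFFSET_CAPTURE);
    // With PREG_OFFSET_CAPTURE each match is [text, offset]; keep the offset.
    return array_column($matches[0], 1);
}

print_r(matchOffsets('/\b \b/', $haystack)); // 5, 17, 20, 22, 27 (12 is skipped)
print_r(matchOffsets('/\s/', $haystack));    // 5, 12, 17, 20, 22, 27
```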
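Note: you need to use \s to match whitespace. \b only matches between a word character and a non-word character, so there is no boundary between the full stop and the space at offset 12. You can apply an if-else to change $pattern based on $needle: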
if ($needle === ' ') {
    // A space next to punctuation has no word boundary, so match whitespace instead.
    $pattern = '/\s/';
} else {
    $pattern = '/\b' . $needle . '\b/';
}
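If the needle can be arbitrary text, the if-else can be generalised a little. The following is a rough sketch under my own assumptions, not part of the original code: findPositions() is a hypothetical name, it adds \b only on the side where the needle starts or ends with a word character, and it escapes the needle with preg_quote() so regex metacharacters are matched literally.

```php
<?php
// Hypothetical helper (an assumption, not from the question): return the
// offsets of every occurrence of $needle, using \b only where it can apply.
function findPositions(string $needle, string $haystack): array
{
    if ($needle === '') {
        return []; // an empty needle has no meaningful positions
    }
    // Escape regex metacharacters so the needle is matched literally.
    $quoted = preg_quote($needle, '/');
    // \b is only useful next to a word character (letter, digit, underscore).
    $left  = preg_match('/^\w/', $needle) ? '\b' : '';
    $right = preg_match('/\w\z/', $needle) ? '\b' : '';

    $pattern = '/' . $left . $quoted . $right . '/';
    preg_match_all($pattern, $haystack, $matches, PREG_OFFSET_CAPTURE);
    // Each match is [text, offset]; keep only the offsets.
    return array_column($matches[0], 1);
}
```

With the haystack from the question, findPositions(' ', $haystack) gives 5, 12, 17, 20, 22, 27, while findPositions('is', $haystack) gives only 18, because the "is" inside "this" is not a whole word.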
Upvotes: 1