Reputation: 73
I have the following string:
H: 290‐314 P: 280‐301+330U+200B+331string‐305+351+338‐308+310 [2]
I need all the numbers after P:, i.e. [280,301,330,331,305,351,338,308,310].
Note that the string contains U+200B, which is a character code and should be ignored.
I tried #P:\s((\d+)[\‐]+)+# but that doesn't work.
Upvotes: 1
Views: 197
Reputation: 47991
I'd use the continue metacharacter (\G) this way:
$str = 'H: 290‐314 P: 280‐301+330U+200B+331string‐305+351+338‐308+310 [2]';
preg_match_all('~(?:P: |\G(?!^)(?:U\+200B)?[^\d ]+)\K\d+~', $str, $m);
var_export($m[0]);
Start from P: then match consecutive digits.
Consume non-digits, non-spaces, and your blacklisted string as delimiters.
Forget unwanted substrings with \K.
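Since the question asks for an array of numbers rather than strings, here is a minimal sketch that wraps the same pattern in a helper and casts each match to an integer. The function name extractNumbersAfterP is hypothetical and only used for illustration.

```php
<?php
// Hypothetical helper (the name is illustrative, not from the question).
// Returns the integers that follow "P: " in the input string.
function extractNumbersAfterP(string $str): array
{
    // Same pattern as above: start at "P: ", then continue from the end of
    // the previous match (\G), skipping an optional "U+200B" and a run of
    // non-digit, non-space delimiters before each number.
    $pattern = '~(?:P: |\G(?!^)(?:U\+200B)?[^\d ]+)\K\d+~';

    if (!preg_match_all($pattern, $str, $m)) {
        return []; // no match (0) or regex error (false)
    }

    // preg_match_all() yields strings; cast them to integers.
    return array_map('intval', $m[0]);
}

$str = 'H: 290‐314 P: 280‐301+330U+200B+331string‐305+351+338‐308+310 [2]';
var_export(extractNumbersAfterP($str));
```

The chain stops at the space before [2] because a space is not an allowed delimiter, so the trailing 2 is not picked up.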
Upvotes: 1
Reputation: 627093
You can use
(?:\G(?!\A)(?:[^\d\s]*200B)?|P:\h*)[^\d\s]*\K(?!200B)\d+
See the regex demo.
Details:
(?:\G(?!\A)(?:[^\d\s]*200B)?|P:\h*) - either the end of the previous successful match, optionally followed by zero or more chars other than digits/whitespace and then 200B, or P: and zero or more horizontal whitespaces
[^\d\s]* - zero or more chars other than digits and whitespace
\K - match reset operator that discards the text matched so far from the overall match memory buffer
(?!200B)\d+ - one or more digits that are not starting the 200B char sequence.
See the PHP demo:
$text = 'H: 290‐314 P: 280‐301+330U+200B+331string‐305+351+338‐308+310 [2]';
if (preg_match_all('~(?:\G(?!\A)(?:[^\d\s]*200B)?|P:\h*)[^\d\s]*\K(?!200B)\d+~', $text, $matches)) {
print_r($matches[0]);
}
Output:
Array
(
    [0] => 280
    [1] => 301
    [2] => 330
    [3] => 331
    [4] => 305
    [5] => 351
    [6] => 338
    [7] => 308
    [8] => 310
)
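As a reading aid, the same pattern can be spelled out in extended mode (the x flag), where whitespace in the pattern is ignored and # starts a comment. This is only a sketch of an equivalent layout, and the variable names are illustrative.

```php
<?php
$text = 'H: 290‐314 P: 280‐301+330U+200B+331string‐305+351+338‐308+310 [2]';

// Same regex as above, laid out with the x (extended) flag.
$pattern = '~
    (?:
        \G(?!\A)            # continue where the previous match ended, not at string start
        (?:[^\d\s]*200B)?   # optionally skip a delimiter run that ends in 200B
      |
        P:\h*               # or start at P: followed by horizontal whitespace
    )
    [^\d\s]*                # skip remaining non-digit, non-whitespace delimiters
    \K                      # discard everything matched so far
    (?!200B)\d+             # digits that do not begin the 200B sequence
~x';

$matches = [];
if (preg_match_all($pattern, $text, $matches)) {
    print_r($matches[0]);
}
```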
Upvotes: 0