Reputation: 5
I'm looking to split a string by spaces, unless there is the string " NOT ", in which case I would only want to split by the space before the "NOT", and not after the "NOT". Example:
"cancer disease NOT brain NOT sickle"
should become:
["cancer", "disease", "NOT brain", "NOT sickle"]
Here is what I have so far, but it is incorrect:
$splitKeywordArr = preg_split('/[^(NOT)]( )/', "cancer disease NOT brain NOT sickle")
It results in:
["cance", "diseas", "NOT brai", "NOT sickle"]
I know why it is incorrect, but I don't know how to fix it.
Upvotes: 0
Views: 54
Reputation: 43169
You may use
<?php
$text = "cancer disease NOT brain NOT sickle";
$pattern = "~NOT\s+(*SKIP)(*FAIL)|\s+~";
print_r(preg_split($pattern, $text));
?>
Which yields
Array
(
[0] => cancer
[1] => disease
[2] => NOT brain
[3] => NOT sickle
)
See a demo on ideone.com.
Upvotes: 2
Reputation: 163362
You might also match optional repetitions of the word NOT followed by 1+ word characters in case the word occurs multiple times after each other.
(?:\bNOT\h+)*\w+
The pattern matches:
(?:
Non capture group\bNOT\h+
A word boundary, match NOT and 1 or more horizontal whitespace chars)*
Close non capture group and optionally repeat\w+
Match 1+ word characters$str = "cancer disease NOT brain NOT sickle";
preg_match_all('/(?:\bNOT\h+)*\w+/', $str, $matches);
print_r($matches[0]);
Output
Array
(
[0] => cancer
[1] => disease
[2] => NOT brain
[3] => NOT sickle
)
Upvotes: 1