Reputation: 11
I've run into a small problem. I have a long string containing many words, and I'm trying to split it up. Most parts of the string have a static start and end marker to reference, but this one only has a static end. The piece I'm trying to extract is dynamic: it is at most 16 characters long (possibly fewer), and the number of words in it is unknown.
Example:
Name: John Smith Occupation: Doctor Currently Busy Gender: Male
I want to get "Currently Busy" on its own, without also picking up the end of the value that precedes it.
But I also want to use the same code to get "Not Yet Here" from this string:
Name: John Smith Occupation: Doctor Not Yet Here Gender: Male
I can't find a way to do this, and I don't even know if it's possible, so hopefully someone here can help me out.
Upvotes: 1
Views: 156
Reputation: 14931
Not the most elegant way, but here's a solution:
$string = 'Name: John Smith Occupation: Doctor Currently Busy Gender: Male';
// Split on [\s? => optional space][\w+ => word characters a-zA-Z0-9_][:][\s? => optional space]
$groups = array_filter(preg_split('/\s?\w+:\s?/', $string));
// array_filter() drops the empty first element but preserves the keys,
// so $groups[2] contains 'Doctor Currently Busy'
$pieces = array_reverse(explode(' ', $groups[2]));
// Counters for the loop
$length = 0;
$i = 0;
$c = count($pieces);
$result = array();
// $c and $i preserve the first word (the occupation) even if all words together are <= 16 characters.
// Note: only word lengths are summed; the spaces between words are not counted.
foreach ($pieces as $piece) {
    $length += strlen($piece);
    $i++;
    if ($length <= 16 && $c != $i) {
        $result[] = $piece;
    } else {
        break;
    }
}
$result = array_reverse($result);
$final_result = implode(' ', $result);
echo $final_result; // Currently Busy
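Since the question asks for the same code to handle both example strings, the same idea could be wrapped in a function. This is only a sketch under a few assumptions: the name extractStatus and its $maxLen parameter are made up for illustration, the labels "Occupation:" and "Gender:" are assumed to always be present, the occupation is assumed to be exactly one word, and the joining spaces are counted toward the 16-character limit.

```php
<?php
// Hypothetical helper: returns the trailing phrase (at most $maxLen characters,
// spaces included) found between "Occupation:" and "Gender:", or null if none.
function extractStatus(string $line, int $maxLen = 16): ?string
{
    // Capture everything between the two static labels.
    if (!preg_match('/Occupation:\s*(.*?)\s*Gender:/', $line, $m)) {
        return null;
    }
    $words = preg_split('/\s+/', $m[1], -1, PREG_SPLIT_NO_EMPTY);
    $picked = array();
    $length = 0;
    // Walk backwards from the last word; index 0 is always left for the occupation.
    for ($i = count($words) - 1; $i >= 1; $i--) {
        // Add 1 for the joining space once at least one word has been picked.
        $extra = strlen($words[$i]) + ($picked ? 1 : 0);
        if ($length + $extra > $maxLen) {
            break;
        }
        $length += $extra;
        array_unshift($picked, $words[$i]);
    }
    return $picked ? implode(' ', $picked) : null;
}

echo extractStatus('Name: John Smith Occupation: Doctor Currently Busy Gender: Male'), "\n"; // Currently Busy
echo extractStatus('Name: John Smith Occupation: Doctor Not Yet Here Gender: Male'), "\n";   // Not Yet Here
```

Like the loop above, this is a heuristic: a multi-word occupation followed by a short phrase (for example "Software Engineer Busy") would still be split in the wrong place, because nothing in the string marks where the occupation ends.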
Upvotes: 0
Reputation: 10302
Your problem is one that regex alone may not be able to solve. If the value of "Occupation" can be one or more words, and it's directly followed by another value that could also be one or more words, then how would you tell the two phrases apart, even as a human?
I'm hoping that, at the very least, you have a set of known Occupation values. If that's the case, then you can craft your expression like this:
(?<=Doctor |Nurse ).*(?= Gender)
The (?<=...) and (?=...) bits are lookbehind and lookahead assertions that essentially say "make sure that the expression Doctor |Nurse appears before the matched phrase (but do not match that part of it), and that the expression Gender appears after the matched phrase (but also do not match that part of it)."
See this in action: http://regexr.com?34buq
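In PHP, the expression can be used with preg_match(). The sketch below assumes the only known occupations are Doctor and Nurse, as in the pattern above; the function name statusAfterOccupation is made up for illustration. PCRE accepts this lookbehind because each alternative inside it has a fixed length, even though the two lengths differ.

```php
<?php
// Hypothetical helper: returns the phrase between a known occupation
// and " Gender", or null when no known occupation precedes it.
function statusAfterOccupation(string $line): ?string
{
    // Lookbehind: "Doctor " or "Nurse " must come right before the match.
    // Lookahead: " Gender" must come right after it. Neither is part of the match.
    $pattern = '/(?<=Doctor |Nurse ).*(?= Gender)/';
    return preg_match($pattern, $line, $m) ? $m[0] : null;
}

echo statusAfterOccupation('Name: John Smith Occupation: Doctor Currently Busy Gender: Male'), "\n"; // Currently Busy
echo statusAfterOccupation('Name: John Smith Occupation: Nurse Not Yet Here Gender: Male'), "\n";    // Not Yet Here
```

An occupation that is not in the list makes the function return null rather than a wrong guess, so the list has to be kept complete for this approach to work.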
Upvotes: 1