Reputation: 2291
I am trying to get the values that come after a specific character and stop at other specific characters. This is what I tried:
$whois = 'Registrant Name: Domain Administrator Registrant Organization: Yahoo! Inc. Registrant Street: 701 First Avenue Registrant City: Sunnyvale';
$data = preg_match_all('/:\s(.*?)\s/', $whois, $data_whois);
var_dump($data_whois[1]);
The whois is for yahoo: http://whois.domaintools.com/yahoo.com
CURRENT OUTPUT
1 => string 'Domain' (length=6)
2 => string 'Yahoo!' (length=6)
3 => string '701' (length=3)
4 => string 'Sunnyvale' (length=9)
EXPECTED OUTPUT
1 => string 'Domain Administrator' (length=20)
2 => string 'Yahoo! Inc.' (length=11)
3 => string '701 First Avenue' (length=16)
4 => string 'Sunnyvale' (length=9)
But it's only taking the first word. I believe that is because of the (.*?)\s part.
I also tried (.*?\s.*?)\s
and it takes the second word as well, but if the value doesn't have a second word it picks up the word Registrant instead,
so I need to stop at Registrant, but I don't understand exactly how.
Upvotes: 0
Views: 38
Reputation: 174696
It seems like each of your field labels has exactly two words, with a : after the second word. If so, you may try the regex below.
: \K.*?(?= \S+ \S+:|$)
The PHP code would be:
<?php
$data = 'Registrant Name: Domain Administrator Registrant Organization: Yahoo! Inc. Registrant Street: 701 First Avenue Registrant City: Sunnyvale';
$regex = '~: \K.*?(?= \S+ \S+:|$)~';
preg_match_all($regex, $data, $matches);
print_r($matches);
?>
Output:
Array
(
[0] => Array
(
[0] => Domain Administrator
[1] => Yahoo! Inc.
[2] => 701 First Avenue
[3] => Sunnyvale
)
)
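To make the two-word assumption concrete, here is a small sketch of what happens when a label has three words. The "Postal Code" field and its value are hypothetical, invented only for this illustration; they are not part of the whois data in the question.

```php
<?php
// Hypothetical input: "Postal Code" is an invented three-word label,
// used only to show where the two-word assumption stops holding.
$data = 'Registrant Name: Domain Administrator Registrant Postal Code: 94089 Registrant City: Sunnyvale';

// Same regex as above: \K drops the ": " from the match, and the lookahead
// stops the lazy .*? right before " word word:" or at the end of the string.
$regex = '~: \K.*?(?= \S+ \S+:|$)~';
preg_match_all($regex, $data, $matches);

print_r($matches[0]);
// The first value comes back as "Domain Administrator Registrant", because
// the lookahead only recognises " Postal Code:" as the start of the next label.
```

If labels like that can occur in your data, anchoring on the word Registrant is probably the safer choice.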
Upvotes: 1
Reputation: 89557
Since you are using the lazy quantifier .*? followed by \s, the match stops at the first whitespace character.
One way to solve the problem is to require that .*? be followed either by a space and the word "Registrant" or by the end of the string:
/:\s(.*?)(?:\sRegistrant\b|\s*$)/
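As a minimal sketch, this is how the pattern could be plugged into preg_match_all with the sample string from the question. The variable names are only illustrative.

```php
<?php
$whois = 'Registrant Name: Domain Administrator Registrant Organization: Yahoo! Inc. Registrant Street: 701 First Avenue Registrant City: Sunnyvale';

// Group 1 is the value. The non-capturing group ends it either at
// " Registrant" (the start of the next label) or at the end of the string.
$pattern = '/:\s(.*?)(?:\sRegistrant\b|\s*$)/';
preg_match_all($pattern, $whois, $matches);

// $matches[1] holds only the captured values, without the ": " prefix.
print_r($matches[1]);
```

Note that the values are in $matches[1], not $matches[0], since the full match also includes the colon and the trailing " Registrant".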
Another possible way is to use preg_split:
$str = 'Registrant Name: Domain Administrator Registrant Organization: Yahoo! Inc. Registrant Street: 701 First Avenue Registrant City: Sunnyvale';
$pattern = '~\s*\bRegistrant[^:]+:\s*~';
$result = preg_split($pattern, $str, -1, PREG_SPLIT_NO_EMPTY);
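If you also want to keep the field names, a possible variation (a sketch, not tested beyond this sample string) is to capture the label in the split pattern and add PREG_SPLIT_DELIM_CAPTURE, so that labels and values come back alternating. The names $parts and $fields are just placeholders.

```php
<?php
$str = 'Registrant Name: Domain Administrator Registrant Organization: Yahoo! Inc. Registrant Street: 701 First Avenue Registrant City: Sunnyvale';

// Same idea as the split above, but the label is now a capture group,
// so PREG_SPLIT_DELIM_CAPTURE returns it between the split pieces.
$pattern = '~\s*\bRegistrant\s+([^:]+):\s*~';
$parts = preg_split($pattern, $str, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);

// $parts alternates label, value, label, value, ...
// Pair them up and use the label as the array key.
$fields = array_column(array_chunk($parts, 2), 1, 0);

print_r($fields);
```

One caveat: PREG_SPLIT_NO_EMPTY also drops empty values, so a field with nothing after the colon would shift the pairs out of alignment.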
Upvotes: 2