Reputation: 1941
I have an array of top level domains like:
['ag', 'asia', 'asia_sunrise', 'com', 'com.ag', 'org.hn']
Given a domain name, how can I extract its top level domain based on the array above? I don't care how many levels the domain has; I only need to extract the top level domain.
For example:
test1.ag -> should return ag
test2.com.ag -> should return com.ag
test.test2.com.ag -> should return com.ag
test3.org -> should return false
Thanks
Upvotes: 0
Views: 1877
Reputation: 19002
$domains = ['ag', 'asia', 'asia_sunrise', 'com', 'com.ag', 'org.hn'];
$str = 'test.test2.com.ag'; //your string
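// dots are escaped as \. because an unescaped . matches any character in a regex
$found = preg_match('/\b('.str_replace('.', '\.', implode('|', $domains)).')$/', $str, $matches);
// $matches[0] only exists when there was a match, so check the return value first
$result = $found ? $matches[0] : false;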
Edit: added word boundary in regexp and $result is your string or false
Upvotes: 2
Reputation: 6037
Using a regexp is not really needed here, so it is best avoided.
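function topDomain($url) {
    $arr = ['ag', 'asia', 'asia_sunrise', 'com', 'hn'];
    // without a scheme parse_url() puts the name in 'path', with a scheme it is in 'host'
    $parts = parse_url($url);
    $host = $parts['host'] ?? $parts['path'] ?? '';
    $toplevel = explode(".", $host);
    $last = end($toplevel);
    // return the last label if it is in the list, otherwise false
    return in_array($last, $arr, true) ? $last : false;
}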
P.S. 'com.ag' and 'org.hn' are not top-level domains but second-level domains, so they were left out of the example.
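If the multi-label entries from the question's list ('com.ag', 'org.hn') still need to be matched, the same regex-free idea can probably be extended by comparing whole suffixes instead of only the last label. A minimal sketch; the function name matchSuffix is made up for illustration, and skipping the full name itself (so a bare 'ag' returns false) is an assumption:

```php
<?php
// Hypothetical helper: returns the longest listed suffix of $domain, or false.
function matchSuffix(string $domain, array $suffixes)
{
    $labels = explode('.', strtolower($domain));
    // Drop one more leading label on each pass, so longer candidates are tried first.
    // Starting at 1 means the full name itself is never a candidate (assumption).
    for ($i = 1; $i < count($labels); $i++) {
        $candidate = implode('.', array_slice($labels, $i));
        if (in_array($candidate, $suffixes, true)) {
            return $candidate;
        }
    }
    return false;
}

$list = ['ag', 'asia', 'asia_sunrise', 'com', 'com.ag', 'org.hn'];
var_dump(matchSuffix('test.test2.com.ag', $list)); // string(6) "com.ag"
var_dump(matchSuffix('test3.org', $list));         // bool(false)
```

Because candidates are tried from longest to shortest, the order of the list does not matter here.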
Upvotes: 0
Reputation: 8472
Updated to incorporate Traxo's point about the . wildcard; I think my answer is a little fuller, so I'll leave it up, but we've both essentially come to the same solution.
//set up test variables
$aTLDList = ['ag', 'asia', 'asia_sunrise', 'com', 'com.ag', 'org.hn'];
$sDomain = "badgers.co.uk"; // for example
//build the match
$reMatch = '/^.*?\.(' . str_replace('.', '\.', implode('|', $aTLDList)) . ')$/';
$sMatchedTLD = preg_match($reMatch, $sDomain) ?
preg_replace($reMatch, "$1", $sDomain) :
"";
Resorting to regular expressions may be overkill, but it makes for a concise example. This will give you either the matched TLD or an empty string in the $sMatchedTLD variable.
The trick is to make the first .* match ungreedy (.*?); otherwise badgers.com.ag will match ag rather than com.ag.
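To see the difference between the greedy and ungreedy forms, here is a small sketch. The alternation is shortened to three entries for illustration only, and the variable names are made up:

```php
<?php
// Shortened, already-escaped alternation (illustrative subset of the list).
$alternatives = 'ag|com|com\.ag';

// Greedy: .* grabs as much as possible, so only the last label is left to match.
preg_match('/^.*\.(' . $alternatives . ')$/', 'badgers.com.ag', $greedy);

// Ungreedy: .*? stops at the first dot after which an alternative reaches the end.
preg_match('/^.*?\.(' . $alternatives . ')$/', 'badgers.com.ag', $lazy);

echo $greedy[1], "\n"; // ag
echo $lazy[1], "\n";   // com.ag
```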
Upvotes: 1
Reputation: 9430
Firstly, you should order the array so that longer entries come before the shorter entries they end with, for example 'com.ag' before 'ag'. And then:
function get_domain($s){
$a = ['com.ag', 'ag', 'asia_sunrise', 'asia', 'com', 'org.hn'];
foreach($a as $v){
if(preg_match('/(?:^|\.)' . preg_quote($v, '/') . '$/', $s)){// ends with the value on a label boundary, dots escaped
return $v;
}
}
return false;// if none matched the pattern, loop ends and returns false
}
echo get_domain('test.test2.com.ag');// com.ag
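The loop above depends on the longer entries being tried first. If the list arrives in arbitrary order, one way to get there (a sketch, not the only option) is to sort it by string length, longest first, before looping. The variable name $tlds is made up:

```php
<?php
$tlds = ['ag', 'asia', 'asia_sunrise', 'com', 'com.ag', 'org.hn'];

// Longest first, so 'com.ag' is always tried before 'ag'.
usort($tlds, function ($a, $b) {
    return strlen($b) - strlen($a);
});

echo $tlds[0], "\n";                // asia_sunrise
echo $tlds[count($tlds) - 1], "\n"; // ag
```

Sorting by length is stricter than needed (only entries that end with another entry have to come first), but it is simple and gives the required order.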
Upvotes: 0
Reputation: 78991
The parse_url() function gives you access to the host name of the URL. You can use that to process the host name and find the TLD.
$url = 'http://your.url.com.np';
var_dump(parse_url($url, PHP_URL_HOST));
Next steps could be using explode() to split the host name and checking the last item in the exploded list. But I am going to leave that to you.
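For what it's worth, those next steps might look roughly like the sketch below. The function name lastLabel is made up, and it only returns the final label, so it would still need to be checked against your own list:

```php
<?php
// Hypothetical helper: returns the last dot-separated label of the URL's host, or false.
function lastLabel(string $url)
{
    // parse_url() gives null when there is no host (e.g. no scheme) and false on badly malformed input.
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return false;
    }
    $labels = explode('.', $host);
    return end($labels);
}

var_dump(lastLabel('http://your.url.com.np')); // string(2) "np"
var_dump(lastLabel('your.url.com.np'));        // bool(false), no scheme so no host
```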
Upvotes: 0