Reputation: 6501
I'm using this function to get the domain and subdomain from a string. But if the string is already in my expected format (no scheme), it returns null:
function getDomainFromUrl($url) {
$host = parse_url($url, PHP_URL_HOST);
return preg_replace('/^www\./', '', $host);
}
$url = "http://abc.example.com/" -> abc.example.com | OK
$url = "http://www.example.com/" -> example.com | OK
$url = "abc.example.com" -> FAILS!
Upvotes: 5
Views: 1829
Reputation: 19182
Here's a pure regex solution:
function getDomainFromUrl($url) {
if (preg_match('/^(?:https?:\/\/)?(?:(?:[^@]*@)|(?:[^:]*:[^@]*@))?(?:www\.)?([^\/:]+)/', $url, $parts)) {
return $parts[1];
}
return false; // or maybe '', depending on what you need
}
getDomainFromUrl("http://abc.example.com/"); // abc.example.com
getDomainFromUrl("http://www.example.com/"); // example.com
getDomainFromUrl("abc.example.com"); // abc.example.com
getDomainFromUrl("username@abc.example.com"); // abc.example.com
getDomainFromUrl("https://username:password@abc.example.com"); // abc.example.com
getDomainFromUrl("https://username:password@abc.example.com:123"); // abc.example.com
You can try it here: http://sandbox.onlinephpfunctions.com/code/3f0343bbb68b190bffff5d568470681c00b0c45c
In case you want to know more about the regex:
^ the match must start at the beginning of the string
(?:https?:\/\/)? an optional, non-capturing group that matches http:// or https://
(?:(?:[^@]*@)|(?:[^:]*:[^@]*@))? an optional, non-capturing group that matches either *@ or *:*@, where * stands for any run of characters
(?:www\.)? an optional, non-capturing group that matches www.
([^\/:]+) a capturing group that matches everything up to a '/', a ':', or the end of the string
Upvotes: 3
Reputation: 12689
parse_url() does not find a host in a URL that has no scheme: a string like abc.example.com is treated as a path. You can test whether the scheme is present and, if not, add a default one:
if ( !preg_match( '/^([^\:]+)\:\/\//', $url ) ) $url = 'http://' . $url;
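For completeness, here is a minimal sketch of how that check might be combined with the function from the question. The function name comes from the question; returning null when there is still no host is an assumption made for illustration, not something this answer specifies.

```php
<?php
function getDomainFromUrl($url) {
    // Assumption: input without a "scheme://" prefix is a bare host, so prepend a default scheme.
    if (!preg_match('/^[^:]+:\/\//', $url)) {
        $url = 'http://' . $url;
    }
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null; // assumption: signal unparseable input with null
    }
    // Strip a leading "www." as in the original function.
    return preg_replace('/^www\./', '', $host);
}

echo getDomainFromUrl('abc.example.com'), "\n";         // abc.example.com
echo getDomainFromUrl('http://www.example.com/'), "\n"; // example.com
```

Note that this treats every scheme-less input as a host name, so a relative path such as "images/logo.png" would come back as "images".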
Upvotes: 0
Reputation: 4245
That is because parse_url() finds no PHP_URL_HOST in abc.example.com: without a protocol, the string is not recognized as a URL with a host. So check for a protocol first and, if the URL doesn't have one, add it. Something simple like this:
function addhttp($url) {
if (!preg_match("~^(?:f|ht)tps?://~i", $url)) {
$url = "http://" . $url;
}
return $url;
}
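function getDomainFromUrl($url) {
    $host = parse_url($url, PHP_URL_HOST);
    if (!$host) {
        // Not a URL with a protocol: add one and parse once more.
        // (Parsing only once more avoids endless recursion on input such as "http://".)
        $host = parse_url(addhttp($url), PHP_URL_HOST);
    }
    // Still no host: the input is not a usable URL.
    return $host ? preg_replace('/^www\./', '', $host) : false;
}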
Upvotes: 3
Reputation: 23892
The issue is that parse_url() returns null for the host when the string has no scheme. Check that you actually got a host before using it; otherwise $host is empty. Here the original string is used as the fallback:
<?php
function getDomainFromUrl($url) {
$host = (parse_url($url, PHP_URL_HOST) != '') ? parse_url($url, PHP_URL_HOST) : $url;
return preg_replace('/^www\./', '', $host);
}
echo getDomainFromUrl("http://abc.example.com/") . "\n";
echo getDomainFromUrl("http://www.example.com/") . "\n";
echo getDomainFromUrl("abc.example.com");
Output:
abc.example.com
example.com
abc.example.com
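To see why the fallback is needed, this short sketch prints what parse_url() reports for the scheme-less string from the question. The variable names are only illustrative.

```php
<?php
// Without a scheme, the whole string is reported as the path and there is no host.
$parts = parse_url('abc.example.com');
$host  = parse_url('abc.example.com', PHP_URL_HOST);

var_dump($parts); // an array with a single 'path' key holding "abc.example.com"
var_dump($host);  // NULL

// With a scheme, the host is found as expected.
$hostWithScheme = parse_url('http://abc.example.com/', PHP_URL_HOST);
var_dump($hostWithScheme); // string(15) "abc.example.com"
```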
Upvotes: 1