Reputation: 493
I have a bunch of URLs like these:
$urls = array(
'https://site1.com',
'https://www.site2.com',
'http://www.site3.com',
'https://site4.com',
'site5.com',
'www.site6.com',
'www.site7.co.uk',
'site8.tk'
);
I want to remove the http://, https://, and www. prefixes from these strings so that the output looks like this:
$urls = array(
'site1.com',
'site2.com',
'site3.com',
'site4.com',
'site5.com',
'site6.com',
'site7.co.uk',
'site8.tk'
);
I came up with this solution.
foreach ($urls as $url) {
$pattern = '/(http[s]?:\/\/)?(www\.)?/i';
$replace = "";
echo "before: $url after: ".preg_replace('/\/$/', '', preg_replace($pattern, $replace, $url))."\n";
}
I was wondering how I could avoid the second preg_replace. Any ideas?
Upvotes: 3
Views: 4245
Reputation: 101926
Depending on what exactly you want to do, it might be better to stick with PHP's own URL parsing facilities, namely parse_url:
foreach ($urls as &$url) {
$url = preg_replace('~^www\.~', '', parse_url($url, PHP_URL_HOST)); // escape the dot so only a literal "www." is removed
}
unset($url);
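parse_url will give you the host of the URL, even if the URL contains a port number or HTTP authentication data. (Whether this is what you need depends on your exact use case, though.) Note that parse_url only reports a host when the string has a scheme (or starts with //), so entries such as 'site5.com' would need a scheme prepended first.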
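A minimal sketch of how the scheme-less entries could be handled with parse_url. It assumes a hypothetical helper named bare_host (not part of PHP) and that prepending http:// to such inputs is acceptable for your use case:

```php
<?php
// Hypothetical helper (not part of PHP): returns the host without a leading "www.".
function bare_host(string $url): ?string
{
    // parse_url() only reports a host when a scheme (or a leading "//") is present,
    // so prepend a placeholder scheme to inputs such as "site5.com".
    if (!preg_match('~^[a-z][a-z0-9+.-]*://~i', $url)) {
        $url = 'http://' . $url;
    }

    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return null; // seriously malformed input
    }

    // Escape the dot so that only a literal "www." prefix is removed.
    return preg_replace('~^www\.~i', '', $host);
}

// Sample inputs made up for this sketch, including a port and credentials.
$urls = [
    'https://www.site2.com',
    'site5.com',
    'www.site7.co.uk',
    'http://user:pw@www.site3.com:8080/path',
];
$hosts = array_map('bare_host', $urls);
print_r($hosts);
```

Unlike the pure regex approaches, this drops the path, port, and credentials as well, which may or may not be what you want.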
Upvotes: 0
Reputation: 154543
Short and sweet:
$urls = preg_replace('~^(?:https?://)?(?:www[.])?~i', '', $urls);
Upvotes: 0
Reputation: 141829
preg_replace can also take an array, so you don't even need the loop. You can do this with a one-liner:
$urls = preg_replace('/^(?:https?:\/\/)?(?:www\.)?(.*?)\/?$/i', '$1', $urls); // lazy (.*?) leaves the optional trailing slash to \/?
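As a small illustration of passing the whole array to preg_replace, here is a sketch that uses a lazily matched capture group so that an optional trailing slash is dropped in the same call. The sample values and the variable name $clean are made up for the example:

```php
<?php
// Sample inputs for this sketch; two of them end with a slash.
$urls = ['https://site1.com/', 'http://www.site3.com', 'www.site7.co.uk/', 'site8.tk'];

// ^ anchors the optional prefixes to the start of the string; (.*?) is lazy,
// so the optional trailing slash is matched by \/? instead of being captured.
$clean = preg_replace('/^(?:https?:\/\/)?(?:www\.)?(.*?)\/?$/i', '$1', $urls);

print_r($clean);
```

With a greedy (.*) the group would swallow the trailing slash itself, so the lazy quantifier is what lets a single call replace the nested preg_replace from the question.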
Upvotes: 14
Reputation: 59287
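/^(https?:\/\/)?(www\.)?(.*?)\/?$/i
And use what's in $3. Or, even better, change the first two groups to the non-capturing form (?:...) and use what's in $1. (The trailing slash is optional here and the last group is lazy, so URLs without a trailing slash still match.)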
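To make the group numbering concrete, here is a short sketch using preg_match on a single made-up URL. The patterns make the trailing slash optional, which is an adjustment for this example, and the variable names are illustrative only:

```php
<?php
$url = 'https://www.site2.com/';

// Capturing version: groups 1 and 2 hold the prefixes, group 3 holds the rest.
preg_match('/^(https?:\/\/)?(www\.)?(.*?)\/?$/i', $url, $m);
$withCapturing = $m[3];

// Non-capturing version: (?:...) groups are not numbered, so the rest is group 1.
preg_match('/^(?:https?:\/\/)?(?:www\.)?(.*?)\/?$/i', $url, $n);
$withNonCapturing = $n[1];

echo $withCapturing, "\n", $withNonCapturing, "\n";
```

Both variants extract the same text; the non-capturing form just keeps the match array smaller and the replacement reference simpler.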
Upvotes: 13