lukemcd

Reputation: 1501

regex to create link from url and strip www

I have a PHP function that takes a URL and creates a clean link. It puts the full URL in the anchor tag's href and displays just "www.domain.com" as the link text. It works well, but I would like to modify it so that it strips out the "www." part as well.

<?php
    // pass a url like: http://www.yelp.com/biz/my-business-name
    // should return: <a href="http://www.yelp.com/biz/my-business-name">yelp.com</a>
    function formatURL($url, $target=FALSE) {
        if ($target) { $anchor_tag = "<a href=\"\\0\" target=\"$target\">\\4</a>"; }
        else { $anchor_tag = "<a href=\"\\0\">\\4</a>"; }
        $return_link = preg_replace("`(http|ftp)+(s)?:(//)((\w|\.|\-|_)+)(/)?(\S+)?`i", $anchor_tag, $url);
        return $return_link;
    }
?>

My regex skills are not that strong, so any help is greatly appreciated.

Upvotes: 1

Views: 650

Answers (2)

majic bunnie

Reputation: 1405

Take a look at parse_url: https://www.php.net/manual/en/function.parse-url.php

This will simplify your logic quite a bit and can make removing the "www." a simple string operation.

$link = 'http://www.yelp.com/biz/my-business-name';
$hostname = parse_url($link, PHP_URL_HOST);
if(strpos($hostname, 'www.') === 0)
{
   $hostname = substr($hostname, 4);
}

I have modified my original answer to account for the issue raised in the comments. The preg_replace in the other answer will also work and is a bit more concise. I will leave this here as an alternative that does not require invoking the regex engine.
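To tie this back to the function in the question, here is a minimal sketch of how the parse_url approach could build the whole anchor. The function name formatUrlWithoutWww is made up for illustration, and the sketch assumes the input is a single URL rather than a block of text containing URLs (the original preg_replace version also rewrites URLs found inside longer text). It also assumes PHP 7.1+ for the nullable type hint.

```php
<?php
// Hypothetical helper, not from the original post: builds an anchor whose
// link text is the host name without a leading "www.".
// Assumes $url is one complete URL, not free text containing URLs.
function formatUrlWithoutWww(string $url, ?string $target = null): string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        // No host could be parsed; return the input unchanged.
        return $url;
    }

    // Remove "www." only when it is a prefix (case-insensitive check).
    if (stripos($host, 'www.') === 0) {
        $host = substr($host, 4);
    }

    // Escape values before placing them into HTML.
    $href = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
    $text = htmlspecialchars($host, ENT_QUOTES, 'UTF-8');
    $targetAttr = $target !== null
        ? ' target="' . htmlspecialchars($target, ENT_QUOTES, 'UTF-8') . '"'
        : '';

    return '<a href="' . $href . '"' . $targetAttr . '>' . $text . '</a>';
}

echo formatUrlWithoutWww('http://www.yelp.com/biz/my-business-name'), "\n";
// prints: <a href="http://www.yelp.com/biz/my-business-name">yelp.com</a>
```

Escaping with htmlspecialchars is an addition of mine, not something the question asked for; drop it if the URLs are already trusted and escaped elsewhere.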

Upvotes: 5

Manse

Reputation: 38147

This will get you the domain name minus the "www.":

$url = preg_replace('/^www\./', '', parse_url($url, PHP_URL_HOST));

The ^ in the regex anchors the match to the start of the string, so only a leading "www." is removed.

Working example : http://codepad.org/FTNikw8g
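One detail worth checking when using a pattern like this: an unescaped dot in a regex matches any character, so it is safer to escape it. The following is a small sketch comparing the two forms; the host names are invented for the demonstration.

```php
<?php
// Illustrative host names; 'wwwx.example.com' is made up for this demo.
$hosts = ['www.yelp.com', 'wwwx.example.com', 'news.www.example.com'];

$results = [];
foreach ($hosts as $host) {
    $results[$host] = [
        // Unescaped dot: "." matches any single character after "www".
        'loose'  => preg_replace('/^www./', '', $host),
        // Escaped dot: only a literal "www." prefix is removed.
        'strict' => preg_replace('/^www\./', '', $host),
    ];
}

foreach ($results as $host => $r) {
    echo $host, ' => loose: ', $r['loose'], ', strict: ', $r['strict'], "\n";
}
// www.yelp.com => loose: yelp.com, strict: yelp.com
// wwwx.example.com => loose: .example.com, strict: wwwx.example.com
// news.www.example.com => loose: news.www.example.com, strict: news.www.example.com
```

In both forms the ^ anchor keeps a "www." in the middle of the host from being touched; only the escaped version avoids eating the fourth character of a host that merely starts with "www".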

Upvotes: 3
