samsamm777

Reputation: 1686

Regular Expression: URLs to Links

I'm using the following regex to convert URLs to HTML links. It works great, but I've found a bug: it also matches URLs inside a style attribute that sets a background image.

/**
 * Convert URLs in a string to HTML links
 * @param string $str
 * @return string
 */
public static function ConvertUrlsToHtml($str)
{
    $str = preg_replace( '@(?<![.*">])\b(?:(?:https?|ftp|file)://|[a-z]\.)[-A-Z0-9+&#/%=~_|$?!:,.]*[A-Z0-9+&#/%=~_|$]@i', '<a href="\0">\0</a>', $str);
    return $str;
}

If I use the following...

<div class="inner-left" style="background-image: url(http://www.somewebsite/background.jpg);"></div>

It converts the background image URL to a link too.

Does anyone know how I can tweak the regex to ignore URLs inside style attributes?

Upvotes: 0

Views: 107

Answers (1)

Tchoupi

Reputation: 14681

You can start by removing the HTML tags, because you don't want to replace URLs inside tags at all. That holds for style=, and equally for <img src=...>, <a href=...> and so on.

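function ConvertUrlsToHtml($str)
{
  // Strip the tags so URLs inside attributes (style, src, href) are never matched
  $strNoTags = strip_tags($str);

  if (preg_match_all( '@(?<![.*">])\b(?:(?:https?|ftp|file)://|[a-z]\.)[-A-Z0-9+&#/%=~_|$?!:,.]*[A-Z0-9+&#/%=~_|$]@i', $strNoTags, $matches)) {

    // De-duplicate first: str_replace() already replaces every occurrence,
    // so a URL that appears twice would otherwise be wrapped twice
    foreach (array_unique($matches[0]) as $match) {
      $str = str_replace($match, "<a href=\"$match\">$match</a>", $str);
    }
  }

  return $str;
}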

What it does:

  1. Remove the tags
  2. Find all URLs in the tag-free string
  3. Replace each URL found with a link in the original string

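As noted in the comments, you could also use an HTML parser to extract the text instead of strip_tags. That would additionally avoid rewriting a URL that appears both in the text and inside an attribute, since str_replace touches every occurrence in the original string.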
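
A rough sketch of the parser route, in case it helps. This is not the code from the comments, just one way it could look under some assumptions: the function name convertUrlsInTextNodes is made up, PHP's DOM extension is assumed to be available, and the pattern is a simplified variant of the one in the question (scheme-prefixed URLs only, no lookbehind). It walks the text nodes only, so attribute values such as style or src are never touched.

```php
<?php
// Hypothetical helper (not from the answer above); requires PHP's DOM extension.
function convertUrlsInTextNodes(string $html): string
{
    // Simplified variant of the question's pattern: scheme-prefixed URLs only,
    // wrapped in a capturing group so preg_split() returns the matches too.
    $pattern = '@(\b(?:https?|ftp|file)://[-A-Z0-9+&#/%=~_|$?!:,.]*[A-Z0-9+&#/%=~_|$])@i';

    $previous = libxml_use_internal_errors(true); // silence warnings on sloppy markup
    $doc = new DOMDocument();
    // The leading XML declaration is a common trick to make loadHTML() read UTF-8;
    // the wrapper <div> gives a single known parent to serialize from.
    $doc->loadHTML('<?xml encoding="UTF-8"><html><body><div>' . $html . '</div></body></html>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $wrapper = $doc->getElementsByTagName('div')->item(0);
    $xpath = new DOMXPath($doc);
    // Text nodes only, skipping text already inside links, scripts, or style blocks.
    $query = './/text()[not(ancestor::a) and not(ancestor::script) and not(ancestor::style)]';

    // Copy to an array so replacing nodes cannot disturb the iteration.
    foreach (iterator_to_array($xpath->query($query, $wrapper)) as $textNode) {
        $parts = preg_split($pattern, $textNode->nodeValue, -1, PREG_SPLIT_DELIM_CAPTURE);
        if ($parts === false || count($parts) < 2) {
            continue; // no URL in this text node
        }
        $fragment = $doc->createDocumentFragment();
        foreach ($parts as $i => $part) {
            if ($part === '') {
                continue;
            }
            if ($i % 2 === 1) { // odd indexes hold the captured URLs
                $a = $doc->createElement('a');
                $a->setAttribute('href', $part);
                $a->appendChild($doc->createTextNode($part));
                $fragment->appendChild($a);
            } else {
                $fragment->appendChild($doc->createTextNode($part));
            }
        }
        $textNode->parentNode->replaceChild($fragment, $textNode);
    }

    // Serialize only the children of the wrapper, not the wrapper itself.
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Calling it on the div from the question followed by a paragraph that mentions a URL in plain text should leave the style attribute alone and wrap only the URL in the paragraph. Note that the serializer may normalize whitespace or markup slightly, so the output is not guaranteed to be byte-identical to the input outside the new links.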

Upvotes: 1
