Reputation: 395
I just wrote a regex pattern to replace links with HTML anchor tags. Here it is:
~((http\:\/\/|https\:\/\/)([^ ]+)) ~
I'm asking because I only finished this regex recently and tested it with a few links. It works well, but I want to be sure there are no bugs in the pattern (I'm a regex newbie), so I'd welcome an opinion or suggestions from a regex expert.
By the way, if you're wondering about the space at the end: you might think the pattern won't work when the string doesn't end with a space, but my trick is to append a space to the string before the replacement and remove it again afterwards.
PS:
I'm not handling validation of the link itself. I just want to find the strings that start with http:// or https:// and end with a space, nothing else, since link validation is a bit complicated.
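The pad-and-trim trick described above could be sketched roughly as follows. The function name `linkify_with_padding` and the sample input are made up for illustration; the pattern is the one posted above.

```php
<?php
// Hypothetical helper: append a sentinel space, replace, then remove it again.
function linkify_with_padding(string $text): string
{
    $pattern = '~((http\:\/\/|https\:\/\/)([^ ]+)) ~';
    $padded  = $text . ' ';                                   // add the trailing space
    $result  = preg_replace($pattern, '<a href="$1">$1</a> ', $padded);
    return substr($result, 0, -1);                            // drop the space that was added
}

echo linkify_with_padding('see http://example.com'), "\n";
```

Because the space is always appended and always removed, a string that already ends with a space keeps its own trailing space.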
EDIT:
Some of my code:
<?php
$patron = "~(https?:\/\/[^\s]+) ~";
//$patron = "~((http\:\/\/|https\:\/\/)([^ ]+)) ~";
$reemplazar = '<a href="$1">$1</a> ';
$cadena = "https://www.youtube.com/watch?v=7it5wioGixA ";
echo preg_replace($patron, $reemplazar, $cadena);
?>
Upvotes: 0
Views: 122
Reputation: 6606
I think this can be greatly simplified:
~(https?://\S+) ~
Other than that: Looks okay to me.
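A quick way to compare the two patterns is sketched below, using made-up sample strings. One difference worth knowing: `[^ ]` excludes only the space character, while `\S` excludes every whitespace character, so the two can disagree when a tab or newline follows the URL.

```php
<?php
$original    = '~((http\:\/\/|https\:\/\/)([^ ]+)) ~';
$simplified  = '~(https?://\S+) ~';
$replacement = '<a href="$1">$1</a> ';

// Space-separated text: both patterns produce the same result.
$plain = 'go to https://example.com/a?b=1 now';
$same  = preg_replace($original, $replacement, $plain)
     === preg_replace($simplified, $replacement, $plain);

// A tab right after the URL: [^ ]+ runs through the tab, \S+ stops before it.
$tabbed  = "http://example.com\tnext word ";
$differs = preg_replace($original, $replacement, $tabbed)
       !== preg_replace($simplified, $replacement, $tabbed);

var_dump($same, $differs);
```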
Upvotes: 2
Reputation: 89574
Following the same idea, your pattern can be shortened to:
~https?://[^\s"'>]+~ # don't forget to escape the quote you use.
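As a small illustration, assuming the pattern is written in a single-quoted PHP string (so the single quote inside the character class is the one that needs escaping), it finds URLs without relying on a trailing space. The sample text is invented.

```php
<?php
$pattern = '~https?://[^\s"\'>]+~';
$text    = 'Visit "http://a.example/x" or https://b.example/y today';

// Each match stops at whitespace, a quote, or ">".
preg_match_all($pattern, $text, $matches);
print_r($matches[0]);
```

Note that punctuation directly after a URL, such as a sentence-ending period, is not excluded by this character class and would be captured as part of the match.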
To change URLs to links:
$html = preg_replace_callback('~\b(?:(https?://)|www\.)[^]\s"\')<]++~',
    function ($m) {
        // Group 1 is unset when the match starts with "www.", so add a scheme only then.
        $url = empty($m[1]) ? 'http://' . $m[0] : $m[0];
        if (filter_var($url, FILTER_VALIDATE_URL))
            return '<a href="' . $url . '">' . $m[0] . '</a>';
        else return $m[0];
    }, $html);
Old answer:
To change URLs inside links:
A better way to extract all href attributes from all "a" tags is to use the DOM.
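$doc = new DOMDocument();
@$doc->loadHTML($htmlString); // @ suppresses warnings about malformed markup
// href is an attribute, not a tag: select the <a> elements instead.
$links = $doc->getElementsByTagName('a');
foreach ($links as $link) { // a DOMNodeList cannot be iterated by reference
    $href = $link->getAttribute('href');
    $link->setAttribute('href', 'what you want');
}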
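For a self-contained run of the DOM approach, something like the sketch below should work, assuming the DOM extension is available. The sample markup and the `old.example` / `new.example` hosts are placeholders, not part of the original answer.

```php
<?php
// Assumed sample input for illustration only.
$htmlString = '<p><a href="http://old.example/a">A</a> and <a href="http://old.example/b">B</a></p>';

libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($htmlString);

$hrefs = [];
foreach ($doc->getElementsByTagName('a') as $link) {
    $old = $link->getAttribute('href');
    $link->setAttribute('href', str_replace('old.example', 'new.example', $old));
    $hrefs[] = $link->getAttribute('href');   // record the rewritten value
}

echo $doc->saveHTML();   // serialize the modified document
```

`saveHTML()` returns the whole document, including the doctype and the `html`/`body` wrappers that `loadHTML()` adds around a fragment.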
Upvotes: 1