Reputation: 24886
I have a simple commenting system where people can submit hyperlinks inside the plain text field. When I display these records back from the database and into the web page, what RegExp in PHP can I use to convert these links into HTML-type anchor links?
I don't want the algorithm to do this with any other kind of link, just http and https.
Upvotes: 67
Views: 110714
Reputation: 1021
Here is another solution. It catches all http, https and www links and converts them to clickable links.
$url = '~(?:(https?)://([^\s<]+)|(www\.[^\s<]+?\.[^\s<]+))(?<![\.,:])~i';
$string = preg_replace($url, '<a href="$0" target="_blank" title="$0">$0</a>', $string);
echo $string;
Alternatively, to catch only links that start with an explicit protocol (http, https, ftp or ftps), use the code below.
$url = '/(http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/';
$string= preg_replace($url, '<a href="$0" target="_blank" title="$0">$0</a>', $string);
echo $string;
EDIT: The script below will catch all URL types and convert them to clickable links.
$url = '@(http)?(s)?(://)?(([a-zA-Z])([-\w]+\.)+([^\s\.]+[^\s]*)+[^,.\s])@';
$string = preg_replace($url, '<a href="http$2://$4" target="_blank" title="$0">$0</a>', $string);
echo $string;
Update: if the script above strips the (s) from your links, use the code block below instead. Thanks to @AndrewEllis for pointing this out.
$url = '@(http(s)?)?(://)?(([a-zA-Z])([-\w]+\.)+([^\s\.]+[^\s]*)+[^,.\s])@';
$string = preg_replace($url, '<a href="http$2://$4" target="_blank" title="$0">$0</a>', $string);
echo $string;
Here's a very simple solution for the URL not displaying correctly.
$email = '<a href="mailto:[email protected]">[email protected]</a>';
$string = $email;
echo $string;
It is a very simple fix but you will have to modify it for your own purpose.
I've provided several scripts because requirements and server setups differ: some people want only HTTP/HTTPS links, some also want www links, and some want FTP/FTPS. Each one works depending on how your own scripts are set up, and the text above each one says what it does. I hope one of them works for you; if not, let me know and hopefully I can come up with another solution.
Upvotes: 89
Reputation: 2513
/**
 * Turn all URLs into clickable links.
 *
 * @param string $value
 * @param array  $protocols  http/https, ftp, mail, twitter
 * @param array  $attributes
 * @return string
 */
public function linkify($value, $protocols = array('http', 'mail'), array $attributes = array())
{
    // Link attributes
    $attr = '';
    foreach ($attributes as $key => $val) {
        $attr .= ' ' . $key . '="' . htmlentities($val) . '"';
    }

    $links = array();

    // Extract existing links and tags
    $value = preg_replace_callback('~(<a .*?>.*?</a>|<.*?>)~i', function ($match) use (&$links) { return '<' . array_push($links, $match[1]) . '>'; }, $value);

    // Extract text links for each protocol
    foreach ((array)$protocols as $protocol) {
        switch ($protocol) {
            case 'http':
            case 'https': $value = preg_replace_callback('~(?:(https?)://([^\s<]+)|(www\.[^\s<]+?\.[^\s<]+))(?<![\.,:])~i', function ($match) use ($protocol, &$links, $attr) { if ($match[1]) $protocol = $match[1]; $link = $match[2] ?: $match[3]; return '<' . array_push($links, "<a $attr href=\"$protocol://$link\">$link</a>") . '>'; }, $value); break;
            case 'mail': $value = preg_replace_callback('~([^\s<]+?@[^\s<]+?\.[^\s<]+)(?<![\.,:])~', function ($match) use (&$links, $attr) { return '<' . array_push($links, "<a $attr href=\"mailto:{$match[1]}\">{$match[1]}</a>") . '>'; }, $value); break;
            case 'twitter': $value = preg_replace_callback('~(?<!\w)[@#](\w++)~', function ($match) use (&$links, $attr) { return '<' . array_push($links, "<a $attr href=\"" . ($match[0][0] == '@' ? '' : 'search/%23') . $match[1] . "\">{$match[0]}</a>") . '>'; }, $value); break;
            default: $value = preg_replace_callback('~' . preg_quote($protocol, '~') . '://([^\s<]+?)(?<![\.,:])~i', function ($match) use ($protocol, &$links, $attr) { return '<' . array_push($links, "<a $attr href=\"$protocol://{$match[1]}\">{$match[1]}</a>") . '>'; }, $value); break;
        }
    }

    // Insert all links
    return preg_replace_callback('/<(\d+)>/', function ($match) use (&$links) { return $links[$match[1] - 1]; }, $value);
}
Not my code, I got it from here
Upvotes: 2
Reputation: 2455
I really liked this answer - yet I needed a solution for possible plain text links that are inside very simple HTML text:
<p>I found a really cool site you might like:</p>
This meant I needed the regex patterns to ignore the HTML chars < and >.
So I changed parts of the patterns to [^\s\>\<] instead of \S:
- \S: not white-space; matches any char that is not white-space (tab, space, newline)
- [^]: a negated set; matches any char not in the set
I needed another format in addition to HTML, so I separated out the regexes from their replacements to accommodate this.
I also added a way to return just the links/emails found into an array so I can save them as a relationship on my posts (great for making meta cards for them later ...and for analytics!).
I was getting matches for text like
- So I wanted to ensure I didn't get any matches that included consecutive dots.
Note: To accomplish fixing this, I added an additional format string to undo matching them to avoid having to redo these otherwise reliable url regexes.
/**
 * based on this answer:
 *
 * @var $string String
 * @var $format String - html (<a href=""...), short ([link:https://somewhere]), other (https://somewhere)
 */
public function formatLinksInString(
    $string,
    $format = 'html',
    $returnMatches = false
) {
    $formatProtocol = $format == 'html'
        ? '<a href="$0" target="_blank" title="$0">$0</a>'
        : ($format == 'short' || $returnMatches ? '[link:$0]' : '$0');
    $formatSansProtocol = $format == 'html'
        ? '<a href="//$0" target="_blank" title="$0">$0</a>'
        : ($format == 'short' || $returnMatches ? '[link://$0]' : '$0');
    $formatMailto = $format == 'html'
        ? '<a href="mailto:$1" target="_blank" title="$1">$1</a>'
        : ($format == 'short' || $returnMatches ? '[mailto:$1]' : '$1');
    $regProtocol = '/(http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,}(\/[^\<\>\s]*)?/';
    $regSansProtocol = '/(?<=\s|\A|\>)([0-9a-zA-Z\-\.]+\.[a-zA-Z0-9\/]{2,})(?=\s|$|\,|\<)/';
    $regEmail = '/([^\s\>\<]+\@[^\s\>\<]+\.[^\s\>\<]+)\b/';
    $consecutiveDotsRegex = $format == 'html'
        ? '/<a[^\>]+[\.]{2,}[^\>]*?>([^\<]*?)<\/a>/'
        : '/\[link:.*?\/\/([^\]]+[\.]{2,}[^\]]*?)\]/';

    // Protocol links
    $formatString = preg_replace($regProtocol, $formatProtocol, $string);
    // Sans Protocol Links
    $formatString = preg_replace($regSansProtocol, $formatSansProtocol, $formatString); // use formatString from above
    // Email - Mailto - Links
    $formatString = preg_replace($regEmail, $formatMailto, $formatString); // use formatString from above
    // Prevent consecutive periods from getting captured
    $formatString = preg_replace($consecutiveDotsRegex, '$1', $formatString);

    if ($returnMatches) {
        // Find all [x:link] patterns
        preg_match_all('/\[.*?:(.*?)\]/', $formatString, $matches);
        return $matches[1]; // return the captured groups (the links themselves)
    }

    return $formatString;
}
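If all you need is the list of links (for example to store them alongside a post), a much smaller sketch may be enough. The function name `extractLinks`, the simplified http/https-only pattern and the sample text are assumptions for illustration, not part of the answer above.

```php
<?php
// Hypothetical helper: return every http/https URL found in $text as a plain array.
function extractLinks(string $text): array
{
    // Stop at whitespace, angle brackets and double quotes so surrounding HTML is not captured.
    preg_match_all('~https?://[^\s<>"]+~i', $text, $matches);
    return $matches[0];
}

print_r(extractLinks('<p>See https://example.com/a and http://example.org/b</p>'));
```

This only collects the URLs; it does not touch emails or protocol-less links, which the function above handles separately.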
Upvotes: 1
Reputation: 2017
Here is my code to format all the links inside text, including emails, urls with and without protocol.
public function formatLinksInText($text)
{
    //Catch all links with protocol
    $reg = '/(http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,}(\/\S*)?/';
    $formatText = preg_replace($reg, '<a href="$0" style="font-weight: normal;" target="_blank" title="$0">$0</a>', $text);

    //Catch all links without protocol
    $reg2 = '/(?<=\s|\A)([0-9a-zA-Z\-\.]+\.[a-zA-Z0-9\/]{2,})(?=\s|$|\,|\.)/';
    $formatText = preg_replace($reg2, '<a href="//$0" style="font-weight: normal;" target="_blank" title="$0">$0</a>', $formatText);

    //Catch all emails
    $emailRegex = '/(\S+\@\S+\.\S+)\b/';
    $formatText = preg_replace($emailRegex, '<a href="mailto:$1" style="font-weight: normal;" target="_blank" title="$1">$1</a>', $formatText);
    $formatText = nl2br($formatText);
    return $formatText;
}
Please comment the url that doesn't work. I'll try to update the regex.
Upvotes: 5
Reputation: 8667
$string = '';
preg_match_all('#(\w*://|www\.)[a-z0-9]+(-+[a-z0-9]+)*(\.[a-z0-9]+(-+[a-z0-9]+)*)+(/([^\s()<>;]+\w)?/?)?#i', $string, $matches, PREG_OFFSET_CAPTURE | PREG_SET_ORDER);
foreach (array_reverse($matches) as $match) {
    $a = '<a href="'.(strpos($match[1][0], '/') ? '' : 'http://') . $match[0][0].'">' . $match[0][0] . '</a>';
    $string = substr_replace($string, $a, $match[0][1], strlen($match[0][0]));
}
echo $string;
<a href=""></a>
<a href=""></a>
<a href=""></a>
<a href=""></a>
<a href=""></a>
What I like in this solution is that it also prepends http:// to links that start with www., because an href without the http/https protocol doesn't work: the browser resolves it relative to the current page.
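The same idea can be sketched with preg_replace_callback. The function name `linkifyWithScheme`, the simplified pattern and the example.com addresses are assumptions made for this illustration.

```php
<?php
// Hypothetical helper: link http(s):// and www. URLs, adding a scheme where it is missing.
function linkifyWithScheme(string $text): string
{
    $pattern = '~\b(?:https?://|www\.)[^\s<>"]+~i';
    return preg_replace_callback($pattern, function (array $m): string {
        $url = $m[0];
        // A bare "www." href would be resolved relative to the current page, so prefix it.
        $href = preg_match('~^https?://~i', $url) ? $url : 'http://' . $url;
        return '<a href="' . $href . '">' . $url . '</a>';
    }, $text);
}

echo linkifyWithScheme('Visit www.example.com or https://example.org/page'), "\n";
```

The visible link text stays exactly as the user typed it; only the href gets the scheme.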
Upvotes: 0
Reputation: 3434
Try this one:
$s = preg_replace('/(?<!href="|">)(?<!src=\")((http|ftp)+(s)?:\/\/[^<>\s]+)/is', '<a href="\\1" target="_blank">\\1</a>', $s);
It skips existing links: if a URL already sits inside an href or src attribute, or is the text of an existing anchor, it is not wrapped again. Any other URL is wrapped in an anchor with a blank target.
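To see the lookbehinds at work, here is a small sketch. It uses a tidied variant of the pattern above; the function name `linkifyUnlinked` and the sample text are assumptions for illustration.

```php
<?php
// Hypothetical helper name; the pattern is a tidied variant of the one above.
function linkifyUnlinked(string $s): string
{
    // Skip URLs directly preceded by href=", src=" or "> (already inside a tag or an anchor).
    $pattern = '~(?<!href="|">)(?<!src=")((?:https?|ftps?)://[^<>\s]+)~i';
    return preg_replace($pattern, '<a href="$1" target="_blank">$1</a>', $s);
}

$input = 'See <a href="http://example.com/a">http://example.com/a</a> and http://example.org/b';
echo linkifyUnlinked($input), "\n";
```

Only the bare URL at the end gets wrapped; the existing anchor and its text are left alone. Note that this relies on the exact spelling href=" with a double quote, so anchors written with single quotes or extra spaces would still be matched.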
Upvotes: 3
If I am right, what you want to do is turn ordinary text into http links. Here's what I think can help:
$list = mysqli_query($con,"SELECT * FROM list WHERE name = 'table content'");
while($row2 = mysqli_fetch_array($list)) {
    echo "<a target='_blank' href='http://www." . $row2['content']. "'>" . $row2['content']. "</a>";
}
Upvotes: -2
Reputation: 15186
I am using a function that originated from question2answer, it accepts plain text and even plain text links in html:
// $html holds the string
$htmlunlinkeds = array_reverse(preg_split('|<[Aa]\s+[^>]+>.*</[Aa]\s*>|', $html, -1, PREG_SPLIT_OFFSET_CAPTURE)); // start from end so we substitute correctly
foreach ($htmlunlinkeds as $htmlunlinked)
{ // and that we don't detect links inside HTML, e.g. <img src="http://...">
    $thishtmluntaggeds = array_reverse(preg_split('/<[^>]*>/', $htmlunlinked[0], -1, PREG_SPLIT_OFFSET_CAPTURE)); // again, start from end
    foreach ($thishtmluntaggeds as $thishtmluntagged)
    {
        $innerhtml = $thishtmluntagged[0];
        if(is_numeric(strpos($innerhtml, '://')))
        { // quick test first
            $newhtml = qa_html_convert_urls($innerhtml, qa_opt('links_in_new_window'));
            $html = substr_replace($html, $newhtml, $htmlunlinked[1]+$thishtmluntagged[1], strlen($innerhtml));
        }
    }
}
echo $html;
function qa_html_convert_urls($html, $newwindow = false)
/*
    Return $html with any URLs converted into links (with nofollow and in a new window if $newwindow).
    Closing parentheses/brackets are removed from the link if they don't have a matching opening one. This avoids creating
    incorrect URLs from ( but allow URLs such as
*/
{
    $uc = 'a-z\x{00a1}-\x{ffff}';
    $url_regex = '#\b((?:https?|ftp)://(?:[0-9'.$uc.'][0-9'.$uc.'-]*\.)+['.$uc.']{2,}(?::\d{2,5})?(?:/(?:[^\s<>]*[^\s<>\.])?)?)#iu';

    // get matches and their positions
    if (preg_match_all($url_regex, $html, $matches, PREG_OFFSET_CAPTURE)) {
        $brackets = array(
            ')' => '(',
            '}' => '{',
            ']' => '[',
        );

        // loop backwards so we substitute correctly
        for ($i = count($matches[1])-1; $i >= 0; $i--) {
            $match = $matches[1][$i];
            $text_url = $match[0];
            $removed = '';
            $lastch = substr($text_url, -1);

            // exclude bracket from link if no matching bracket
            while (array_key_exists($lastch, $brackets)) {
                $open_char = $brackets[$lastch];
                $num_open = substr_count($text_url, $open_char);
                $num_close = substr_count($text_url, $lastch);

                if ($num_close == $num_open + 1) {
                    $text_url = substr($text_url, 0, -1);
                    $removed = $lastch . $removed;
                    $lastch = substr($text_url, -1);
                } else {
                    break; // bracket is matched, so stop; without this the loop never ends
                }
            }

            $target = $newwindow ? ' target="_blank"' : '';
            $replace = '<a href="' . $text_url . '" rel="nofollow"' . $target . '>' . $text_url . '</a>' . $removed;
            $html = substr_replace($html, $replace, $match[1], strlen($match[0]));
        }
    }

    return $html;
}
A bit much code due to accepting links that hold brackets and other characters, but probably it helps.
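The bracket rule on its own is easier to follow when pulled out of the loop. Below is a stand-alone sketch of just that step; the function name `trimUnmatchedBrackets` and the example.com URLs are assumptions for illustration, while the counting rule is the same one used in the function above.

```php
<?php
// Hypothetical helper illustrating the bracket rule; returns [url, removedSuffix].
function trimUnmatchedBrackets(string $url): array
{
    $pairs = [')' => '(', '}' => '{', ']' => '['];
    $removed = '';
    while ($url !== '') {
        $last = substr($url, -1);
        if (!isset($pairs[$last])) {
            break; // URL does not end in a closing bracket
        }
        // Strip only when there is exactly one more closing than opening bracket.
        if (substr_count($url, $last) !== substr_count($url, $pairs[$last]) + 1) {
            break;
        }
        $url = substr($url, 0, -1);
        $removed = $last . $removed;
    }
    return [$url, $removed];
}

print_r(trimUnmatchedBrackets('http://example.com/page)'));
print_r(trimUnmatchedBrackets('http://example.com/Computers_(Software)'));
```

In the first call the trailing ")" has no partner, so it is moved out of the link; in the second the brackets are balanced and the URL is kept whole.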
Upvotes: 1
Reputation: 990
The top-rated answer didn't do the job for me; the following link was not replaced correctly:
After some google searches and some tests, this is what I came up with:
public static function replaceLinks($s) {
    return preg_replace('@(https?://([-\w\.]+)+(:\d+)?(/([\w/_\.%-=#]*(\?\S+)?)?)?)@', '<a href="$1">$1</a>', $s);
}
I'm not an expert in regex, actually it quite confuses me :)
So feel free to comment and improve this solution.
Upvotes: 7
Reputation: 101
For reference, this is how WordPress solves it:
function _make_url_clickable_cb($matches) {
    $ret = '';
    $url = $matches[2];

    if ( empty($url) )
        return $matches[0];
    // removed trailing [.,;:] from URL
    if ( in_array(substr($url, -1), array('.', ',', ';', ':')) === true ) {
        $ret = substr($url, -1);
        $url = substr($url, 0, strlen($url)-1);
    }
    return $matches[1] . "<a href=\"$url\" rel=\"nofollow\">$url</a>" . $ret;
}

function _make_web_ftp_clickable_cb($matches) {
    $ret = '';
    $dest = $matches[2];
    $dest = 'http://' . $dest;

    if ( empty($dest) )
        return $matches[0];
    // removed trailing [,;:] from URL
    if ( in_array(substr($dest, -1), array('.', ',', ';', ':')) === true ) {
        $ret = substr($dest, -1);
        $dest = substr($dest, 0, strlen($dest)-1);
    }
    return $matches[1] . "<a href=\"$dest\" rel=\"nofollow\">$dest</a>" . $ret;
}

function _make_email_clickable_cb($matches) {
    $email = $matches[2] . '@' . $matches[3];
    return $matches[1] . "<a href=\"mailto:$email\">$email</a>";
}

function make_clickable($ret) {
    $ret = ' ' . $ret;
    // in testing, using arrays here was found to be faster
    $ret = preg_replace_callback('#([\s>])([\w]+?://[\w\\x80-\\xff\#$%&~/.\-;:=,?@\[\]+]*)#is', '_make_url_clickable_cb', $ret);
    $ret = preg_replace_callback('#([\s>])((www|ftp)\.[\w\\x80-\\xff\#$%&~/.\-;:=,?@\[\]+]*)#is', '_make_web_ftp_clickable_cb', $ret);
    $ret = preg_replace_callback('#([\s>])([.0-9a-z_+-]+)@(([0-9a-z-]+\.)+[0-9a-z]{2,})#i', '_make_email_clickable_cb', $ret);

    // this one is not in an array because we need it to run last, for cleanup of accidental links within links
    $ret = preg_replace("#(<a( [^>]+?>|>))<a [^>]+?>([^>]+?)</a></a>#i", "$1$3</a>", $ret);
    $ret = trim($ret);
    return $ret;
}
Upvotes: 10
Reputation: 338
The answer from MkVal works but in the case we already have the anchor link, it will render the text in weird format.
Here is the solution which works for me in both cases:
$s = preg_replace(
    "/(?<!a href=\")(?<!src=\")((http|ftp)+(s)?:\/\/[^<>\s]+)/i",
    "<a href=\"\\0\" target=\"_blank\">\\0</a>",
    $s
);
Upvotes: 3
Reputation: 1064
Well, Volomike's answer is much closer. And to push it a bit further, here's what I did for it to disregard the trailing period at the end of the hyperlinks. I also considered URI fragments.
public static function makeClickableLinks($s) {
    return preg_replace('@(https?://([-\w\.]+[-\w])+(:\d+)?(/([\w/_\.#-]*(\?\S+)?[^\.\s])?)?)@', '<a href="$1" target="_blank">$1</a>', $s);
}
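Another way to keep a trailing period out of the link is to match greedily and then trim the punctuation in a callback, rather than encoding the rule in the regex itself. This is only a sketch of that alternative, not the method used above; the function name `linkifyTrimPunctuation`, the set of trimmed characters and the sample sentence are assumptions.

```php
<?php
// Hypothetical helper: link http/https URLs but leave trailing sentence punctuation outside the link.
function linkifyTrimPunctuation(string $text): string
{
    return preg_replace_callback('~https?://[^\s<>"]+~i', function (array $m): string {
        // Cut punctuation that most likely belongs to the sentence, not to the URL.
        $url = rtrim($m[0], '.,;:!?');
        $trailing = substr($m[0], strlen($url));
        return '<a href="' . $url . '" target="_blank">' . $url . '</a>' . $trailing;
    }, $text);
}

echo linkifyTrimPunctuation('Read https://example.com/docs#intro. Then reply.'), "\n";
```

The URI fragment (#intro) stays inside the link, while the period after it is put back after the closing tag.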
Upvotes: 41
Reputation: 24886
public static function makeClickableLinks($s) {
    return preg_replace('@(https?://([-\w\.]+)+(:\d+)?(/([\w/_\.-]*(\?\S+)?)?)?)@', '<a href="$1">$1</a>', $s);
}
Upvotes: 2
Reputation: 3997
I recommend not doing too many things on the fly like this. I prefer to use a simple editor interface like the one used on Stack Overflow. It is called Markdown.
Upvotes: 1
Reputation: 29267
function makeClickableLinks($text)
{
    // eregi_replace() was removed in PHP 7, so this uses preg_replace() with the "i" flag instead.
    $text = html_entity_decode($text);
    $text = " ".$text;
    // http, https, ftp and ftps links
    $text = preg_replace('!(((f|ht)tps?://)[-a-zA-Z0-9@:%_\+.~#?&/=]+)!i',
        '<a href="\\1" target="_blank">\\1</a>', $text);
    // www. links preceded by whitespace or an opening bracket
    $text = preg_replace('!([\s()\[{}])(www\.[-a-zA-Z0-9@:%_\+.~#?&/=]+)!i',
        '\\1<a href="http://\\2" target="_blank">\\2</a>', $text);
    // email addresses
    $text = preg_replace('!([_\.0-9a-z-]+@([0-9a-z][0-9a-z-]+\.)+[a-z]{2,3})!i',
        '<a href="mailto:\\1" target="_blank">\\1</a>', $text);
    return $text;
}
// Example Usage
echo makeClickableLinks("This is a test clickable link: You can also try using an email address like [email protected]");
Upvotes: 7