Reputation: 7994
Here is a design thought: for example, if I put a link in a textarea, how do I get PHP to detect that it's an http:// link and then print it as
print "<a href='http://www.example.com'>http://www.example.com</a>";
I remember doing something like this before; however, it was not foolproof and kept breaking for complex links.
Another good idea would be if you have a link such as
http://example.com/test.php?val1=bla&val2blablabla%20bla%20bla.bl
fix it so it does
print "<a href='http://example.com/test.php?val1=bla&val2=bla%20bla%20bla.bla'>";
print "http://example.com/test.php";
print "</a>";
This one is just an afterthought. Stack Overflow could probably use this as well :D
Any ideas?
Upvotes: 58
Views: 73830
Reputation: 839
This is just a variation of the solution posted by Dharmendra Jadon, so if you like it up vote his instead!
I just added a parameter to make opening the link in a new window (target="_blank") optional, as I saw this in some of the other solutions and liked the flexibility:
function MakeUrls($str, $popup = FALSE)
{
$find=array('`((?:https?|ftp)://\S+[[:alnum:]]/?)`si','`((?<!//)(www\.\S+[[:alnum:]]/?))`si');
$replace=array('<a href="$1"' . ($popup ? ' target="_blank"' : '') . '>$1</a>', '<a href="http://$1"' . ($popup ? ' target="_blank"' : '') . '>$1</a>');
return preg_replace($find,$replace,$str);
}
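A quick usage sketch of the function above, repeated here only so the snippet runs on its own. The sample sentence and www.example.com are made up for illustration.

```php
<?php
// Same function as above, copied so this sketch is self-contained.
function MakeUrls($str, $popup = FALSE)
{
    $find = array('`((?:https?|ftp)://\S+[[:alnum:]]/?)`si', '`((?<!//)(www\.\S+[[:alnum:]]/?))`si');
    $replace = array(
        '<a href="$1"' . ($popup ? ' target="_blank"' : '') . '>$1</a>',
        '<a href="http://$1"' . ($popup ? ' target="_blank"' : '') . '>$1</a>'
    );
    return preg_replace($find, $replace, $str);
}

// Illustrative input only.
$sample = 'Visit www.example.com today';

// Default: the link opens in the same window.
$samePage = MakeUrls($sample);
// With $popup = TRUE: target="_blank" is added to the anchor.
$newWindow = MakeUrls($sample, TRUE);

echo $samePage, "\n", $newWindow, "\n";
```

Note that a bare www. address gets http:// prepended in the href but keeps its original text as the link label.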
Upvotes: -1
Reputation: 1
This class I created works for my needs; admittedly, it does need some work though:
class addLink
{
public function link($string)
{
$expression = "/(?i)\b((?:https?:\/\/|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,63}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'\".,<>?«»“”‘’]))/";
// Link every URL in the text; $1 is the full matched URL.
$string = preg_replace($expression, '<a href="$1" target="_blank">$1</a>', $string);
return $string;
}
}
An example of using this code;
include 'PHP/addLink.php';
if(class_exists('addLink'))
{
$al = new addLink();
}
else{
echo 'Class not found...';
}
$paragraph = $al->link($paragraph);
Upvotes: -1
Reputation: 8751
Let's look at the requirements. You have some user-supplied plain text, which you want to display with hyperlinked URLs.
Edit: Check out GitHub for the latest version, with support for email addresses, authenticated URLs, URLs in quotes and parentheses, HTML input, as well as an updated TLD list.
Here's my take:
<?php
$text = <<<EOD
Here are some URLs:
stackoverflow.com/questions/1188129/pregreplace-to-detect-html-php
Here's the answer: http://www.google.com/search?rls=en&q=42&ie=utf-8&oe=utf-8&hl=en. What was the question?
A quick look at http://en.wikipedia.org/wiki/URI_scheme#Generic_syntax is helpful.
There is no place like 127.0.0.1! Except maybe http://news.bbc.co.uk/1/hi/england/surrey/8168892.stm?
Ports: 192.168.0.1:8080, https://example.net:1234/.
Beware of Greeks bringing internationalized top-level domains: xn--hxajbheg2az3al.xn--jxalpdlp.
And remember.Nobody is perfect.
<script>alert('Remember kids: Say no to XSS-attacks! Always HTML escape untrusted input!');</script>
EOD;
$rexProtocol = '(https?://)?';
$rexDomain = '((?:[-a-zA-Z0-9]{1,63}\.)+[-a-zA-Z0-9]{2,63}|(?:[0-9]{1,3}\.){3}[0-9]{1,3})';
$rexPort = '(:[0-9]{1,5})?';
$rexPath = '(/[!$-/0-9:;=@_\':;!a-zA-Z\x7f-\xff]*?)?';
$rexQuery = '(\?[!$-/0-9:;=@_\':;!a-zA-Z\x7f-\xff]+?)?';
$rexFragment = '(#[!$-/0-9:;=@_\':;!a-zA-Z\x7f-\xff]+?)?';
// Solution 1:
function callback($match)
{
// Prepend http:// if no protocol specified
$completeUrl = $match[1] ? $match[0] : "http://{$match[0]}";
return '<a href="' . $completeUrl . '">'
. $match[2] . $match[3] . $match[4] . '</a>';
}
print "<pre>";
print preg_replace_callback("&\\b$rexProtocol$rexDomain$rexPort$rexPath$rexQuery$rexFragment(?=[?.!,;:\"]?(\s|$))&",
'callback', htmlspecialchars($text));
print "</pre>";
Edit: The following code fixes the above two problems, but is quite a bit more verbose since I'm more or less re-implementing preg_replace_callback using preg_match.
// Solution 2:
$validTlds = array_fill_keys(explode(" ", ".aero .asia .biz .cat .com .coop .edu .gov .info .int .jobs .mil .mobi .museum .name .net .org .pro .tel .travel .ac .ad .ae .af .ag .ai .al .am .an .ao .aq .ar .as .at .au .aw .ax .az .ba .bb .bd .be .bf .bg .bh .bi .bj .bm .bn .bo .br .bs .bt .bv .bw .by .bz .ca .cc .cd .cf .cg .ch .ci .ck .cl .cm .cn .co .cr .cu .cv .cx .cy .cz .de .dj .dk .dm .do .dz .ec .ee .eg .er .es .et .eu .fi .fj .fk .fm .fo .fr .ga .gb .gd .ge .gf .gg .gh .gi .gl .gm .gn .gp .gq .gr .gs .gt .gu .gw .gy .hk .hm .hn .hr .ht .hu .id .ie .il .im .in .io .iq .ir .is .it .je .jm .jo .jp .ke .kg .kh .ki .km .kn .kp .kr .kw .ky .kz .la .lb .lc .li .lk .lr .ls .lt .lu .lv .ly .ma .mc .md .me .mg .mh .mk .ml .mm .mn .mo .mp .mq .mr .ms .mt .mu .mv .mw .mx .my .mz .na .nc .ne .nf .ng .ni .nl .no .np .nr .nu .nz .om .pa .pe .pf .pg .ph .pk .pl .pm .pn .pr .ps .pt .pw .py .qa .re .ro .rs .ru .rw .sa .sb .sc .sd .se .sg .sh .si .sj .sk .sl .sm .sn .so .sr .st .su .sv .sy .sz .tc .td .tf .tg .th .tj .tk .tl .tm .tn .to .tp .tr .tt .tv .tw .tz .ua .ug .uk .us .uy .uz .va .vc .ve .vg .vi .vn .vu .wf .ws .ye .yt .yu .za .zm .zw .xn--0zwm56d .xn--11b5bs3a9aj6g .xn--80akhbyknj4f .xn--9t4b11yi5a .xn--deba0ad .xn--g6w251d .xn--hgbk6aj7f53bba .xn--hlcj6aya9esc7a .xn--jxalpdlp .xn--kgbechtv .xn--zckzah .arpa"), true);
$position = 0;
while (preg_match("{\\b$rexProtocol$rexDomain$rexPort$rexPath$rexQuery$rexFragment(?=[?.!,;:\"]?(\s|$))}", $text, $match, PREG_OFFSET_CAPTURE, $position))
{
list($url, $urlPosition) = $match[0];
// Print the text leading up to the URL.
print(htmlspecialchars(substr($text, $position, $urlPosition - $position)));
$domain = $match[2][0];
$port = $match[3][0];
$path = $match[4][0];
// Check if the TLD is valid - or that $domain is an IP address.
$tld = strtolower(strrchr($domain, '.'));
if (preg_match('{\.[0-9]{1,3}}', $tld) || isset($validTlds[$tld]))
{
// Prepend http:// if no protocol specified
$completeUrl = $match[1][0] ? $url : "http://$url";
// Print the hyperlink.
printf('<a href="%s">%s</a>', htmlspecialchars($completeUrl), htmlspecialchars("$domain$port$path"));
}
else
{
// Not a valid URL.
print(htmlspecialchars($url));
}
// Continue text parsing from after the URL.
$position = $urlPosition + strlen($url);
}
// Print the remainder of the text.
print(htmlspecialchars(substr($text, $position)));
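The TLD check inside the loop can be pulled out into a small helper, which makes it easier to try against individual host names. This is only a sketch: the function name is_linkable_domain and the shortened TLD list are assumptions for illustration, not part of the answer's code.

```php
<?php
// Hypothetical helper: true when $domain looks like an IPv4 address
// or ends in one of the listed TLDs.
function is_linkable_domain(string $domain, array $validTlds): bool
{
    // strrchr() returns the last "." and everything after it, or false if there is no dot.
    $tld = strtolower((string) strrchr($domain, '.'));
    if ($tld === '') {
        return false;
    }
    // A purely numeric last label is treated as the tail of an IP address.
    if (preg_match('/^\.[0-9]{1,3}$/', $tld)) {
        return true;
    }
    return isset($validTlds[$tld]);
}

// Shortened list, for this sketch only.
$validTlds = array_fill_keys(array('.com', '.net', '.org', '.uk'), true);

var_dump(is_linkable_domain('news.bbc.co.uk', $validTlds));
var_dump(is_linkable_domain('127.0.0.1', $validTlds));
var_dump(is_linkable_domain('remember.Nobody', $validTlds));
```

With this list, the "remember.Nobody" case from the sample text is rejected because .nobody is not a known TLD.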
Upvotes: 124
Reputation: 1
This worked for me (turned one of the answers into a PHP function)
function make_urls_from_text ($text){
return preg_replace('/(http[s]{0,1}\:\/\/\S{4,})\s{0,}/ims', '<a href="$1" target="_blank">$1 </a>', $text);
}
Upvotes: -1
Reputation: 1546
As I mentioned in one of the comments above, my VPS, which is running PHP 7, started emitting warnings: Warning: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead. The buffer after the replacement was empty/false.
I have rewritten the code and made some improvements. If you think that you should be in the author section feel free to edit the comment above the function make_links_blank name. I am intentionally not using the closing php ?> to avoid inserting whitespace in the output.
<?php
class App_Updater_String_Util {
public static function get_default_link_attribs( $regex_matches = [] ) {
$t = ' target="_blank" ';
return $t;
}
/**
* App_Updater_String_Util::set_protocol();
* @param string $link
* @return string
*/
public static function set_protocol( $link ) {
if ( ! preg_match( '#^https?#si', $link ) ) {
$link = 'http://' . $link;
}
return $link;
}
/**
* Goes through text and makes whatever text that look like a link an html link
* which opens in a new tab/window (by adding target attribute).
*
* Usage: App_Updater_String_Util::make_links_blank( $text );
*
* @param str $text
* @return str
* @see http://stackoverflow.com/questions/1188129/replace-urls-in-text-with-html-links
* @author Angel.King.47 | http://dashee.co.uk
* @author Svetoslav Marinov (Slavi) | http://orbisius.com
*/
public static function make_links_blank( $text ) {
$patterns = [
'#(?(?=<a[^>]*>.+?<\/a>)
(?:<a[^>]*>.+<\/a>)
|
([^="\']?)((?:https?|ftp):\/\/[^<> \n\r]+)
)#six' => function ( $matches ) {
$r1 = empty( $matches[1] ) ? '' : $matches[1];
$r2 = empty( $matches[2] ) ? '' : $matches[2];
$r3 = empty( $matches[3] ) ? '' : $matches[3];
$r2 = empty( $r2 ) ? '' : App_Updater_String_Util::set_protocol( $r2 );
$res = ! empty( $r2 ) ? "$r1<a href=\"$r2\">$r2</a>$r3" : $matches[0];
$res = stripslashes( $res );
return $res;
},
'#(^|\s)((?:https?://|www\.|https?://www\.)[^<>\ \n\r]+)#six' => function ( $matches ) {
$r1 = empty( $matches[1] ) ? '' : $matches[1];
$r2 = empty( $matches[2] ) ? '' : $matches[2];
$r3 = empty( $matches[3] ) ? '' : $matches[3];
$r2 = ! empty( $r2 ) ? App_Updater_String_Util::set_protocol( $r2 ) : '';
$res = ! empty( $r2 ) ? "$r1<a href=\"$r2\">$r2</a>$r3" : $matches[0];
$res = stripslashes( $res );
return $res;
},
// Remove any target attribs (if any)
'#<a([^>]*)target="?[^"\']+"?#si' => '<a\\1',
// Put the target attrib
'#<a([^>]+)>#si' => '<a\\1 target="_blank">',
// Make emails clickable Mailto links
'/(([\w\-]+)(\\.[\w\-]+)*@([\w\-]+)
(\\.[\w\-]+)*)/six' => function ( $matches ) {
$r = $matches[0];
$res = ! empty( $r ) ? "<a href=\"mailto:$r\">$r</a>" : $r;
$res = stripslashes( $res );
return $res;
},
];
foreach ( $patterns as $regex => $callback_or_replace ) {
if ( is_callable( $callback_or_replace ) ) {
$text = preg_replace_callback( $regex, $callback_or_replace, $text );
} else {
$text = preg_replace( $regex, $callback_or_replace, $text );
}
}
return $text;
}
}
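For anyone who only needs the general idea of moving off the /e modifier, here is a much smaller sketch of the same migration for a single pattern. The function name link_www and the sample sentence are made up; the escaping with htmlspecialchars is my own addition and is not something the class above does.

```php
<?php
// Before PHP 7 this kind of replacement was often written with the /e modifier,
// where the replacement string was evaluated as PHP code.
// With preg_replace_callback the same logic lives in an ordinary function.
function link_www(string $text): string
{
    return preg_replace_callback(
        '/(^|\s)(www\.[^<>\s]+)/i',
        function (array $m): string {
            // $m[1] is the leading whitespace (or empty), $m[2] is the www. address.
            $href  = htmlspecialchars('http://' . $m[2], ENT_QUOTES, 'UTF-8');
            $label = htmlspecialchars($m[2], ENT_QUOTES, 'UTF-8');
            return $m[1] . '<a href="' . $href . '">' . $label . '</a>';
        },
        $text
    );
}

$linked = link_www('See www.example.com for details');
echo $linked, "\n";
```

Because the callback is plain PHP, there is no need for the stripslashes() workaround that the /e versions relied on.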
Upvotes: 1
Reputation: 141
Here is the code, using regular expressions in a function:
<?php
//Function definitions
function MakeUrls($str)
{
$find=array('`((?:https?|ftp)://\S+[[:alnum:]]/?)`si','`((?<!//)(www\.\S+[[:alnum:]]/?))`si');
$replace=array('<a href="$1" target="_blank">$1</a>', '<a href="http://$1" target="_blank">$1</a>');
return preg_replace($find,$replace,$str);
}
//Function testing
$str="www.cloudlibz.com";
$str=MakeUrls($str);
echo $str;
?>
Upvotes: 4
Reputation: 2827
I've been using this function, it works for me
function AutoLinkUrls($str,$popup = FALSE){
if (preg_match_all("#(^|\s|\()((http(s?)://)|(www\.))(\w+[^\s\)\<]+)#i", $str, $matches)){
$pop = ($popup == TRUE) ? " target=\"_blank\" " : "";
for ($i = 0; $i < count($matches['0']); $i++){
$period = '';
if (preg_match("|\.$|", $matches['6'][$i])){
$period = '.';
$matches['6'][$i] = substr($matches['6'][$i], 0, -1);
}
$str = str_replace($matches['0'][$i],
$matches['1'][$i].'<a href="http'.
$matches['4'][$i].'://'.
$matches['5'][$i].
$matches['6'][$i].'"'.$pop.'>http'.
$matches['4'][$i].'://'.
$matches['5'][$i].
$matches['6'][$i].'</a>'.
$period, $str);
}//end for
}//end if
return $str;
}//end AutoLinkUrls
All credit goes to http://snipplr.com/view/68586/
Enjoy!
Upvotes: 4
Reputation: 6013
You guys are talking about way too advanced and complex stuff, which is good for some situations, but mostly we need a simple, careless solution. How about simply this?
preg_replace('/(http[s]{0,1}\:\/\/\S{4,})\s{0,}/ims', '<a href="$1" target="_blank">$1</a> ', $text_msg);
Just try it and let me know what crazy URL it doesn't satisfy.
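One case that trips up the one-liner is a URL at the end of a sentence: \S matches punctuation too, so the trailing period ends up inside the link. The sketch below shows that, followed by one possible variation that leaves trailing punctuation outside. The second pattern is my own assumption of a fix, not something from the answer, and the sample text is invented.

```php
<?php
// Illustrative input: the URL is followed directly by a period.
$text_msg = 'Docs are at http://example.com/manual. Enjoy!';

// The one-liner from the answer, unchanged.
$linked = preg_replace(
    '/(http[s]{0,1}\:\/\/\S{4,})\s{0,}/ims',
    '<a href="$1" target="_blank">$1</a> ',
    $text_msg
);

// Variation (assumption): match lazily and stop before any trailing
// punctuation that is followed by whitespace or the end of the text.
$trimmed = preg_replace(
    '~(https?://\S{4,}?)(?=[.,;:!?)]*(?:\s|$))~i',
    '<a href="$1" target="_blank">$1</a>',
    $text_msg
);

echo $linked, "\n", $trimmed, "\n";
```

The variation is still a heuristic: a URL that legitimately ends in one of those punctuation characters will lose it.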
Upvotes: 18
Reputation: 2643
If you want to trust the IANA, you can get the current list of officially supported TLDs in use like this:
$validTLDs =
explode("\n", file_get_contents('http://data.iana.org/TLD/tlds-alpha-by-domain.txt')); //get the official list of valid tlds
array_shift($validTLDs); //throw away first line containing meta data
array_pop($validTLDs); //throw away last element which is empty
This makes Søren Løvborg's solution #2 a bit less verbose and spares you the hassle of updating the list; nowadays new TLDs are handed out so carelessly ;)
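One thing to watch: as far as I can tell, the IANA file lists TLDs in upper case without a leading dot, while solution #2 looks up lower-case keys such as ".com". A rough sketch of a conversion step follows; the function name tlds_from_iana_list and the sample string are hypothetical, and the parsing assumes the file keeps its one-entry-per-line format with "#" comment lines.

```php
<?php
// Hypothetical helper: turn the raw text of the TLD list into the
// array shape used by solution #2 (keys like ".com", values true).
function tlds_from_iana_list(string $raw): array
{
    $tlds = array();
    foreach (preg_split('/\R/', $raw) as $line) {
        $line = trim($line);
        // Skip blank lines and the "# Version ..." comment line.
        if ($line === '' || $line[0] === '#') {
            continue;
        }
        $tlds['.' . strtolower($line)] = true;
    }
    return $tlds;
}

// Made-up sample in the same shape as the real file; in practice the
// string would come from file_get_contents() as shown above.
$sample = "# Version 0000000000, sample header line\nCOM\nORG\nXN--JXALPDLP\n";
$validTlds = tlds_from_iana_list($sample);

var_dump(isset($validTlds['.com']));
```

Fetching the list on every request would be slow, so caching the parsed array somewhere is probably worth it.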
Upvotes: 0
Reputation: 461
This class changes the URLs into plain text while keeping the home URL as it is. I hope this helps and saves you some time. Enjoy.
class RegClass
{
function preg_callback_url($matches)
{
//var_dump($matches);
//Get the matched URL text <a>text</a>
$text = $matches[2];
//Get the matched URL link <a href ="http://www.test.com">text</a>
$url = $matches[1];
if($url=='href ="http://www.test.com"'){
//replace all a tag as it is
return '<a href='.$url.' rel="nofollow"> '.$text.' </a>';
}else{
//replace all a tag to text
return " $text " ;
}
}
function ParseText($text){
$text = preg_replace( "/www\./", "http://www.", $text );
$regex = "/http:\/\/http:\/\/www\./";
$text = preg_replace( $regex, "http://www.", $text );
$regex2 = "/https:\/\/http:\/\/www\./";
$text = preg_replace( $regex2, "https://www.", $text );
return preg_replace_callback('/<a\s(.+?)>(.+?)<\/a>/is',
array( &$this, 'preg_callback_url'), $text);
}
}
$regexp = new RegClass();
echo $regexp->ParseText($text);
Upvotes: -1
Reputation: 25200
I know this answer has been accepted and that this question is quite old, but it can be useful for other people looking for other implementations.
This is a modified version of the code posted by: Angel.King.47 on July 27,09:
$text = preg_replace(
array(
'/(^|\s|>)(www.[^<> \n\r]+)/iex',
'/(^|\s|>)([_A-Za-z0-9-]+(\\.[A-Za-z]{2,3})?\\.[A-Za-z]{2,4}\\/[^<> \n\r]+)/iex',
'/(?(?=<a[^>]*>.+<\/a>)(?:<a[^>]*>.+<\/a>)|([^="\']?)((?:https?):\/\/([^<> \n\r]+)))/iex'
),
array(
"stripslashes((strlen('\\2')>0?'\\1<a href=\"http://\\2\" target=\"_blank\">\\2</a> \\3':'\\0'))",
"stripslashes((strlen('\\2')>0?'\\1<a href=\"http://\\2\" target=\"_blank\">\\2</a> \\4':'\\0'))",
"stripslashes((strlen('\\2')>0?'\\1<a href=\"\\2\" target=\"_blank\">\\3</a> ':'\\0'))",
),
$text
);
Changes:
As "Søren Løvborg" has stated, this function does not escape the URLs. I tried his/her class but it just didn't work as I expected (If you don't trust your users, then try his/her code first).
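To make the escaping point concrete, here is a small sketch of what goes wrong and one way to guard against it. The helper name link_url_escaped and the hostile sample URL are invented for illustration; this is not the code from either answer.

```php
<?php
// Hypothetical helper: build the anchor from an already-detected URL,
// escaping it for both the attribute and the visible text.
function link_url_escaped(string $url): string
{
    $safe = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
    return '<a href="' . $safe . '" target="_blank">' . $safe . '</a>';
}

// A pattern like [^<> \n\r]+ happily accepts quotes, so a user can
// submit something that tries to break out of the href attribute.
$url = 'http://example.com/?q="onmouseover="alert(1)';

// Unescaped: the quote ends the attribute early and injects a new one.
$unsafe = '<a href="' . $url . '" target="_blank">' . $url . '</a>';
// Escaped: the quote becomes &quot; and stays inside the attribute value.
$escaped = link_url_escaped($url);

echo $unsafe, "\n", $escaped, "\n";
```

If the surrounding text is untrusted as well, it needs the same treatment, otherwise escaping only the URLs does not help much.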
Upvotes: 1
Reputation: 7994
Here is something I found that is tried and tested:
function make_links_blank($text)
{
return preg_replace(
array(
'/(?(?=<a[^>]*>.+<\/a>)
(?:<a[^>]*>.+<\/a>)
|
([^="\']?)((?:https?|ftp|bf2|):\/\/[^<> \n\r]+)
)/iex',
'/<a([^>]*)target="?[^"\']+"?/i',
'/<a([^>]+)>/i',
'/(^|\s)(www.[^<> \n\r]+)/iex',
'/(([_A-Za-z0-9-]+)(\\.[_A-Za-z0-9-]+)*@([A-Za-z0-9-]+)
(\\.[A-Za-z0-9-]+)*)/iex'
),
array(
"stripslashes((strlen('\\2')>0?'\\1<a href=\"\\2\">\\2</a>\\3':'\\0'))",
'<a\\1',
'<a\\1 target="_blank">',
"stripslashes((strlen('\\2')>0?'\\1<a href=\"http://\\2\">\\2</a>\\3':'\\0'))",
"stripslashes((strlen('\\2')>0?'<a href=\"mailto:\\0\">\\0</a>':'\\0'))"
),
$text
);
}
It works for me, and it handles both emails and URLs. Sorry to answer my own question. :(
But this one is the only one that works.
Here is the link where I found it: http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/Q_21878567.html
Sorry in advance for it being from experts-exchange.
Upvotes: 15
Reputation: 13009
this should get you email addresses:
$string = "bah bah [email protected] foo";
$match = preg_match('/[^\x00-\x20()<>@,;:\\".[\]\x7f-\xff]+(?:\.[^\x00-\x20()<>@,;:\\".[\]\x7f-\xff]+)*\@[^\x00-\x20()<>@,;:\\".[\]\x7f-\xff]+(?:\.[^\x00-\x20()<>@,;:\\".[\]\x7f-\xff]+)+/', $string, $array);
print_r($array);
// outputs:
Array
(
[0] => [email protected]
)
Upvotes: 1
Reputation: 99225
Something along the lines of :
<?php
if(preg_match('@^http://(.*?)(?:\s|$)@', $textarea_url, $matches)) {
echo '<a href="http://', $matches[1], '">', $matches[1], '</a>';
}
?>
Upvotes: -1
Reputation: 25781
This RegEx should match any link except for these new 3+ character toplevel domains...
{
  \\b
  # Match the leading part (proto://hostname, or just hostname)
  (
    # http://, or https:// leading part
    (https?)://[-\\w]+(\\.\\w[-\\w]*)+
  |
    # or, try to find a hostname with more specific sub-expression
    (?i: [a-z0-9] (?:[-a-z0-9]*[a-z0-9])? \\. )+ # sub domains
    # Now ending .com, etc. For these, require lowercase
    (?-i: com\\b
        | edu\\b
        | biz\\b
        | gov\\b
        | in(?:t|fo)\\b # .int or .info
        | mil\\b
        | net\\b
        | org\\b
        | [a-z][a-z]\\.[a-z][a-z]\\b # two-letter country code
    )
  )
  # Allow an optional port number
  ( : \\d+ )?
  # The rest of the URL is optional, and begins with /
  (
    /
    # The rest are heuristics for what seems to work well
    [^.!,?;"\\'()\[\]\{\}\s\x7F-\\xFF]*
    (
      [.!,?]+ [^.!,?;"\\'()\\[\\]\{\\}\s\\x7F-\\xFF]+
    )*
  )?
}ix
It's not written by me, I'm not quite sure where I got it from, sorry that I can give no credit...
Upvotes: 1