King Julien

Reputation: 11318

How to convert URLs containing Unicode characters into clickable links?

I use this function to turn URLs into clickable links, but when a URL contains a Unicode character, only the part before that character becomes a link.

Function:

function clickable($text) {
    $text = eregi_replace('(((f|ht){1}tp://)[-a-zA-Z0-9@:%_\+.~#?&//=]+)',
                          '<a class="und" href="\\1">\\1</a>', $text);
    $text = eregi_replace('([[:space:]()[{}])(www.[-a-zA-Z0-9@:%_\+.~#?&//=]+)',
                          '\\1<a href="http://\\2">\\2</a>', $text);
    $text = eregi_replace('([_\.0-9a-z-]+@([0-9a-z][0-9a-z-]+\.)+[a-z]{2,3})',
                          '<a href="mailto:\\1">\\1</a>', $text);

return $text;

}

How can I fix this?

Upvotes: 1

Views: 699

Answers (2)

Tomasz Struczyński

Reputation: 3303

First of all, don't use eregi_replace. I don't think it can handle Unicode, and it has been deprecated since PHP 5.3. Use preg_replace instead.

Then you can try something like this:

$text = preg_replace("/(https?|ftps?|mailto):\/\/([-\w\p{L}\.]+)+(:\d+)?(\/([\w\p{L}\/_\.#]*(\?\S+)?)?)?/u", '<a href="$0">$0</a>', $text);

EDIT: updated the expression to include the # character.

Upvotes: 1

Mark Baker

Reputation: 212442

Try using \p{L} instead of a-zA-Z and \p{Ll} instead of a-z
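
To illustrate the difference, here is a minimal sketch. The sample URL and the simplified patterns are made up for this example and are not the ones from the question; it also assumes the source file is saved as UTF-8. Note that \p{L} only sees whole characters when the pattern carries the /u modifier.

```php
<?php
// Hypothetical sample URL with one non-ASCII letter in the path.
$url = 'http://example.com/café/menu';

// ASCII-only character class, as in the question:
// matching stops right before the "é".
preg_match('!https?://[-a-zA-Z0-9./]+!', $url, $ascii);

// \p{L} (any letter) and \p{N} (any number) with the /u modifier:
// the subject is read as UTF-8 and the whole URL matches.
preg_match('!https?://[-\p{L}\p{N}./]+!u', $url, $unicode);

echo $ascii[0], "\n";   // http://example.com/caf
echo $unicode[0], "\n"; // http://example.com/café/menu
```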

You can find details of Unicode handling in regular expressions here

And get in the habit of using the preg functions rather than the deprecated ereg functions
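
Putting both suggestions together, the function from the question could look roughly like the sketch below. This is one possible port, not a tested drop-in replacement: the `!` delimiter, the extra `s?` (so https:// and ftps:// also match) and the relaxed `{2,}` top-level domain length are my own assumptions, and the character classes are otherwise carried over from the original.

```php
<?php
// Sketch: the question's clickable() ported from eregi_replace to
// preg_replace. a-zA-Z0-9 becomes \p{L}\p{N}; the "u" modifier makes PCRE
// treat pattern and subject as UTF-8, "i" keeps the case-insensitivity
// that eregi_replace provided.
function clickable($text) {
    // http(s):// and ftp(s):// URLs ("s?" is an assumption, see above)
    $text = preg_replace(
        '!((?:f|ht)tps?://[-\p{L}\p{N}@:%_+.~#?&/=]+)!iu',
        '<a class="und" href="$1">$1</a>',
        $text
    );
    // Bare www. addresses preceded by whitespace or an opening bracket
    $text = preg_replace(
        '!([\s()\[{}])(www\.[-\p{L}\p{N}@:%_+.~#?&/=]+)!iu',
        '$1<a href="http://$2">$2</a>',
        $text
    );
    // E-mail addresses ({2,} instead of the original {2,3} is an assumption)
    $text = preg_replace(
        '!([_.\p{L}\p{N}-]+@(?:[\p{L}\p{N}][\p{L}\p{N}-]+\.)+\p{L}{2,})!iu',
        '<a href="mailto:$1">$1</a>',
        $text
    );
    return $text;
}

echo clickable('see http://пример.рф/путь ok'), "\n";
```

With the /u modifier, preg_replace returns null when the input is not valid UTF-8, so it is worth checking the result before printing it.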

Upvotes: 0
