Mireille28

Reputation: 337

How to prevent accented characters from making a PHP regex fail?

The call to preg_replace below fails and returns an empty string when $content contains accented characters such as à or ÿ.

preg_replace('@([^=][^"])(https?://([-\w\.]+[-\w])+(:\d+)?(/([\w/_\.\%\+#-]*(\?\S+)?[^\.\s])?)?)@', '$1<a href="$2" target="_blank">$2</a>', $content);

I rewrote the regex in preg_replace this way, and it worked:

preg_replace('@([^=][^"])(https?://([-\w\.]+[-\w])+(:\d+)?(/([\w/_\.\%\+#ÂÃÄÀÁÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ-]*(\?\S+)?[^\.\s])?)?)@', '$1<a href="$2" target="_blank">$2</a>', $content);

How can I make it shorter?

Upvotes: 1

Views: 63

Answers (1)

Bob

Reputation: 36

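Using a Unicode code point range instead of listing every character may solve your problem. PHP's PCRE does not accept the \u00bf escape, so write the range with \x{...} and add the u modifier after the closing delimiter so the pattern and the subject are treated as UTF-8.

[\x{00bf}-\x{00ff}]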
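As a rough sketch of how that range could replace the long character list in the question's pattern: the sample $content string and the variable names below are made up for illustration, and the code assumes both the script file and the input are UTF-8 encoded. PHP's PCRE writes code points as \x{...}, which only works with the u modifier.

```php
<?php
// Hypothetical sample input, assumed to be valid UTF-8.
$content = 'Voir http://example.com/café pour plus.';

// The question's pattern with the accented-letter list replaced by the
// range U+00BF..U+00FF. The trailing "u" modifier switches PCRE to UTF-8
// mode, which the \x{...} code point escapes require.
$pattern = '@([^=][^"])(https?://([-\w\.]+[-\w])+(:\d+)?'
         . '(/([\w/_\.\%\+#\x{00BF}-\x{00FF}-]*(\?\S+)?[^\.\s])?)?)@u';

$replacement = '$1<a href="$2" target="_blank">$2</a>';

$result = preg_replace($pattern, $replacement, $content);

if ($result === null) {
    // preg_replace returns null on failure; preg_last_error() reports why
    // (for example invalid UTF-8 in the subject or a backtrack limit hit).
    echo 'preg_replace failed, error code: ' . preg_last_error() . PHP_EOL;
} else {
    echo $result . PHP_EOL;
}
```

If the URLs may contain letters outside that range, the Unicode property \p{L} (any letter) inside the character class is a broader alternative, again with the u modifier.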

Upvotes: 1
