Jonathan

Reputation: 6732

Preg_replace regex in PHP gives unexpected empty result

I am using a regex to replace all email addresses in a string with an <a> tag to make them clickable. This works perfectly, except when the email address is preceded by two words of a certain minimum length joined by a dash. Only in that case do I get an empty string as the result.

<?php

$search = '#(^|[ \n\r\t])(([a-z0-9\-_]+(\.?))+@([a-z0-9\-]+(\.?))+[a-z]{2,5})#si';
$replace = '\\1<a href="mailto:\\2">\\2</a>';

$string = "tttteeee-sssstttt [email protected]";
echo preg_replace($search, $replace, $string);
// Output: "" (empty)

$string = "te-st [email protected]";
echo preg_replace($search, $replace, $string);
// Output: "te-st <a href="mailto:[email protected]">[email protected]</a>" (as expected)

$string = "[email protected] tttteeee-sssstttt";
echo preg_replace($search, $replace, $string);
// Output: "<a href="mailto:[email protected]">[email protected]</a> tttteeee-sssstttt" (as expected)

?>

I have tried everything, but I really can't find the problem. One workaround would be to remove the first dash in the regex (before the @ sign), but then email addresses with a dash before the @ would no longer be linked.

Upvotes: 1

Views: 904

Answers (2)

Wrikken

Reputation: 70460

OK, minimal test case: #([a-z-]+\.?)+@#. This hits the backtrack limit (check with preg_last_error()). Because the \. is optional, the engine cannot tell where one repetition ends and the next begins, so deciding whether a character belongs to the inner + or the outer + takes a huge number of attempts. The default pcre.backtrack_limit of 100000 is not enough; setting it to 1000000 is.
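A minimal sketch of the preg_last_error() check mentioned above. The helper name replace_or_explain and the address user@example.com are placeholders made up for this illustration, not part of the original post. Whether the limit is actually hit depends on the configured pcre.backtrack_limit and on whether PCRE JIT is enabled, so no output is shown.

```php
<?php
// Hypothetical helper (not from the original post): wraps preg_replace()
// and turns a null result into a readable message.
function replace_or_explain(string $pattern, string $replacement, string $subject): string
{
    $result = preg_replace($pattern, $replacement, $subject);
    if ($result !== null) {
        return $result;
    }
    // preg_replace() returns null on failure; preg_last_error() says why.
    $code = preg_last_error();
    if ($code === PREG_BACKTRACK_LIMIT_ERROR) {
        return 'PCRE error: backtrack limit exhausted';
    }
    return 'PCRE error code: ' . $code;
}

// user@example.com is a placeholder address chosen for this sketch.
echo replace_or_explain('#([a-z-]+\.?)+@#', '[match]', 'tttteeee-sssstttt user@example.com'), "\n";
```

Echoing the raw preg_replace() result hides the failure, because null prints as an empty string. Checking for null first makes the cause visible.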

To solve this, make it easier on the regex engine: the first (([a-z0-9\-_]+(\.?))+ should become ([a-z0-9\-_]+(\.[a-z0-9\-_]+)*), which is a lot easier to solve internally. And as a bonus, unlike the accepted answer, this still doesn't allow consecutive dots.
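For illustration, here is roughly how the question's pattern might look with only the part before the @ rewritten as suggested. The function name linkify_emails and the example addresses are assumptions made for this sketch. The domain part is left exactly as in the question.

```php
<?php
// Hypothetical wrapper (name invented for this sketch).
function linkify_emails(string $text): ?string
{
    // Local part: one run of allowed characters, then zero or more
    // ".run" groups. Each character can only be consumed one way,
    // so a failed match no longer explodes into nested retries.
    $search  = '#(^|[ \n\r\t])([a-z0-9\-_]+(\.[a-z0-9\-_]+)*@([a-z0-9\-]+(\.?))+[a-z]{2,5})#si';
    // Group 1 is the leading whitespace (or empty), group 2 the address.
    $replace = '\\1<a href="mailto:\\2">\\2</a>';
    return preg_replace($search, $replace, $text);
}

// Placeholder addresses, not the ones from the question.
echo linkify_emails('tttteeee-sssstttt user@example.com'), "\n";
// tttteeee-sssstttt <a href="mailto:user@example.com">user@example.com</a>
echo linkify_emails('a..b@example.com'), "\n";
// a..b@example.com (consecutive dots: left untouched)
```

The domain half still uses the same nested ([...]+(\.?))+ shape, so it could presumably be rewritten the same way if long non-matching domains turn out to be a problem.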

Upvotes: 2

Paul

Reputation: 141839

Try using this for your search string instead:

$search = '#(^|\b)([A-Z0-9_\-.]+@[A-Z0-9_\-.]+\.[A-Z]{2,5})($|\b)#i';
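A quick sketch of how this pattern could be dropped into the question's code. The address is still in capture group 2, so the original $replace string should work unchanged. The function name linkify_emails_simple and the example addresses are made up for this illustration.

```php
<?php
// Hypothetical wrapper (name invented for this sketch).
function linkify_emails_simple(string $text): ?string
{
    $search  = '#(^|\b)([A-Z0-9_\-.]+@[A-Z0-9_\-.]+\.[A-Z]{2,5})($|\b)#i';
    // Group 1 is always empty here (^ and \b are zero-width),
    // group 2 is the address.
    $replace = '\\1<a href="mailto:\\2">\\2</a>';
    return preg_replace($search, $replace, $text);
}

// Placeholder addresses, not the ones from the question.
echo linkify_emails_simple('tttteeee-sssstttt user@example.com'), "\n";
// tttteeee-sssstttt <a href="mailto:user@example.com">user@example.com</a>
echo linkify_emails_simple('a..b@example.com'), "\n";
// <a href="mailto:a..b@example.com">a..b@example.com</a>
```

Because the dot sits inside the character class, this version has no nested quantifiers, but it also accepts consecutive dots, as the second call shows.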

Upvotes: 1
