Reputation: 183
I want to replace all occurrences of #word with an HTML link. I have written a preg_replace() call for this:
$text = preg_replace('~#([\p{L}|\p{N}]+)~u', '<a href="/?aranan=$1">#$1</a>', $text);
The problem is that this regular expression also matches HTML character codes like &#039; and therefore corrupts the output.
I need to exclude alphanumeric substrings which are preceded by &#, but I do not know how to do that using regular expressions.
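To make the failure concrete, here is a minimal sketch that runs the pattern from the question against a short sample string. The sample text and the `$result` variable are assumptions for illustration; only the pattern and the replacement come from the question.

```php
<?php
// Hypothetical sample input: a numeric entity for an apostrophe, plus one real hashtag.
$text = 'It&#039;s #sunny';

// Pattern and replacement exactly as in the question.
$result = preg_replace(
    '~#([\p{L}|\p{N}]+)~u',
    '<a href="/?aranan=$1">#$1</a>',
    $text
);

echo $result, PHP_EOL;
// Prints: It&<a href="/?aranan=039">#039</a>;s <a href="/?aranan=sunny">#sunny</a>
```

The "#039" inside the entity is treated as a hashtag, so the entity is split apart and no longer renders as an apostrophe.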
Upvotes: 1
Views: 124
Reputation: 47864
Skip the HTML entities, replace the remaining hashtags with <a> elements, then properly URL-encode the href value and HTML-encode the printed link text.
Code:
$text = '#Test &#039; #039foo "#bär"';
echo preg_replace_callback(
'~&#\d+;(*SKIP)(*FAIL)|#([\pL\pN]+)~u',
fn($m) => sprintf(
'<a href="/?%s">#%s</a>',
http_build_query(['aranan' => $m[1]]),
htmlentities($m[1])
),
$text
);
Unrendered output:
<a href="/?aranan=Test">#Test</a> &#039; <a href="/?aranan=039foo">#039foo</a> "<a href="/?aranan=b%C3%A4r">#b&auml;r</a>"
Rendered HTML:
#Test ' #039foo "#bär"
Upvotes: 0
Reputation: 1295
You would need to use a [A-Za-z] character class in your regular expression so that it matches only letters and no numbers.
I will edit with an example later on.
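As a rough sketch of what this suggestion might look like in practice: the answer does not give the full pattern, so `~#([A-Za-z]+)~` below is an assumption, and the sample string is hypothetical.

```php
<?php
// Hypothetical sample input with an entity, a digit-led hashtag, and a non-ASCII hashtag.
$text = '#Test &#039; #039foo "#bär"';

// Assumed form of the suggested pattern: ASCII letters only after the "#".
$result = preg_replace(
    '~#([A-Za-z]+)~',
    '<a href="/?aranan=$1">#$1</a>',
    $text
);

echo $result, PHP_EOL;
// Prints: <a href="/?aranan=Test">#Test</a> &#039; #039foo "<a href="/?aranan=b">#b</a>är"
```

The numeric entity is left alone, but so is any hashtag that starts with a digit, and non-ASCII words such as "bär" are cut off at the first character outside A-Z and a-z.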
Upvotes: -1
Reputation: 9560
Use this online regular expression constructor. It explains every flag you may want to use, and it highlights the matches in your example text.
And yes, use [a-zA-Z].
Upvotes: 0
Reputation: 17817
'~(?<!&)#([\p{L}|\p{N}]+)~u'
That's a negative lookbehind assertion: http://www.php.net/manual/en/regexp.reference.assertions.php
Matches # only if not preceded by &
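A minimal sketch of the lookbehind pattern dropped into the preg_replace() call from the question. The sample string and the `$result` variable are assumptions for illustration.

```php
<?php
// Hypothetical sample input: hashtags mixed with a numeric HTML entity.
$text = '#Test &#039; #039foo "#bär"';

// (?<!&) is a negative lookbehind: the "#" must not be directly preceded by "&".
// The "|" inside the character class is a literal pipe, kept here as written in the answer.
$result = preg_replace(
    '~(?<!&)#([\p{L}|\p{N}]+)~u',
    '<a href="/?aranan=$1">#$1</a>',
    $text
);

echo $result, PHP_EOL;
// Prints: <a href="/?aranan=Test">#Test</a> &#039; <a href="/?aranan=039foo">#039foo</a> "<a href="/?aranan=bär">#bär</a>"
```

The lookbehind only inspects the single character before "#", so a hashtag written directly after an ampersand, as in "a&#tag", would be skipped as well. This sketch also leaves the href value unencoded.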
Upvotes: 2