ssougnez

Reputation: 5886

Replace text that is not between certain tags using a regex

Say I have the following text:

"I want a pink banana for my dog"

And I have a list of words and phrases with their definitions. For example:

"pink banana": "This is a weird banane"
"banana": "This is a fruit"

I would like to replace the matching words in my sentence with something like:

<span tooltip="whatever">word</span>

I can do that, but the problem is that in my example, the first term is replaced correctly:

"I want a <span tooltip="whatever">pink banana</span> for my dog"

But replacing the second term then produces unwanted output:

"I want a <span tooltip="whatever">pink <span tooltip="whatever">banana</span></span> for my dog"

This produces two tooltips on the word banana, which I don't want. Basically, I'd like to replace the regex I use to match the words ("\b(WORD)\b") with one that only matches a word if it's not already inside a "<span tooltip="(.*)"></span>".

Is this possible?

EDIT

Here's the code I use to loop through the items and replace the word:

foreach (var glossaryItem in items)
{
    textNode.InnerHtml = Regex.Replace(textNode.InnerHtml, $@"\b({glossaryItem.Name})\b", $"<span tooltip=\"{glossaryItem.Definition}\">$1</span>", RegexOptions.IgnoreCase);
}

Upvotes: 1

Views: 250

Answers (1)

Chris

Reputation: 2304

What you could try is adding negative lookaround assertions to your regex: a negative lookbehind (?<!...) before the word and a negative lookahead (?!...) after it (or something similar to suit your needs).

For example:

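foreach (var glossaryItem in items)
{
    // Skip a term that directly follows "> or directly precedes </span>,
    // i.e. one that already sits at the edge of a tooltip span.
    // Regex.Escape keeps metacharacters in the term from being read as regex syntax.
    textNode.InnerHtml = Regex.Replace(textNode.InnerHtml, $@"\b(?<!"">)({Regex.Escape(glossaryItem.Name)})(?!</span>)\b", $"<span tooltip=\"{glossaryItem.Definition}\">$1</span>", RegexOptions.IgnoreCase);
}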

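This matches the word only if "> does not immediately precede it and </span> does not immediately follow it. Note that this only protects the first and last words of an already-wrapped phrase; a term in the middle of a longer wrapped phrase would still be matched.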
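
If the lookarounds turn out to be too fragile, a different technique (not the one shown above) is to do all replacements in a single pass: join every term into one alternation, longest first, and pick the definition in a match evaluator. Because the regex never rescans its own output, a shorter term cannot be wrapped inside a longer one. The following is only a minimal sketch: GlossaryTagger and Tag are hypothetical names, the glossary is assumed to be a dictionary of term to definition, and the input is assumed to contain no markup yet.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Text.RegularExpressions;

public static class GlossaryTagger
{
    // Hypothetical helper (not from the original code): wraps each glossary
    // term found in the text in a tooltip span, using one regex pass.
    public static string Tag(string text, IDictionary<string, string> glossary)
    {
        if (glossary.Count == 0)
        {
            return text; // nothing to replace; avoids building an empty alternation
        }

        // Case-insensitive lookup so "Banana" still finds the "banana" entry.
        var lookup = new Dictionary<string, string>(glossary, StringComparer.OrdinalIgnoreCase);

        // Longest terms first: alternatives are tried left to right, so
        // "pink banana" is preferred over "banana" at the same position.
        var alternatives = lookup.Keys
            .OrderByDescending(term => term.Length)
            .Select(term => Regex.Escape(term));

        var pattern = $@"\b(?:{string.Join("|", alternatives)})\b";

        return Regex.Replace(
            text,
            pattern,
            match =>
            {
                // Leave the text untouched if the lookup unexpectedly misses.
                if (!lookup.TryGetValue(match.Value, out var definition))
                {
                    return match.Value;
                }

                // HtmlEncode keeps quotes in a definition from breaking the attribute.
                return $"<span tooltip=\"{WebUtility.HtmlEncode(definition)}\">{match.Value}</span>";
            },
            RegexOptions.IgnoreCase);
    }
}
```

With the two entries from the question, Tag("I want a pink banana for my dog", glossary) should wrap "pink banana" once and leave the inner "banana" alone. It also never looks inside the tooltip attribute, so a definition that happens to contain another term is not touched.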

Upvotes: 2
