Thabiso Mofokeng

Reputation: 739

Regular expression for replacing an HTML ampersand with its XML equivalent

How can I write a C# regular expression that replaces an ampersand (&) with &amp;, given that the ampersand may already be part of an entity such as &amp;, &gt; or &copy;, or may be a bare ampersand as in AT&T?

For example, something like: Regex.Replace("(\S)&(\S![^;])", "$1&amp;$2");

Upvotes: 0

Views: 4119

Answers (3)

Oskar Kjellin

Reputation: 21900

I would propose that you just HtmlEncode it:

System.Web.HttpUtility.HtmlEncode()

Upvotes: 1

bartosz.lipinski

Reputation: 2677

You could use a regex like: "&(?!(amp|lt|apos|gt|quot);)"

The negative lookahead skips any ampersand that already starts one of the five predefined XML entities.
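A minimal sketch of this approach, assuming the goal is to leave the five predefined XML entities untouched and escape every other ampersand. The entity names are grouped so that the trailing semicolon applies to each of them. The class and method names here are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AmpersandEscaper
{
    // Matches "&" unless it begins one of the five predefined XML entities.
    // (?:...) groups the names so the ";" is required after any of them.
    private static readonly Regex BareAmpersand =
        new Regex("&(?!(?:amp|lt|gt|apos|quot);)");

    public static string Escape(string input)
    {
        // "&" has no special meaning in a .NET replacement string.
        return BareAmpersand.Replace(input, "&amp;");
    }

    public static void Main()
    {
        // Prints: AT&amp;T &amp; x &gt; y &amp;copy;
        Console.WriteLine(Escape("AT&T &amp; x &gt; y &copy;"));
    }
}
```

Note that &copy; is not predefined in XML, so this pattern escapes its ampersand too. Whether that is desirable depends on whether the target document declares such entities.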

Upvotes: 3

Richard

Reputation: 109120

To directly replace "&" with "&amp;" you don't need a regex; just use String.Replace (or StringBuilder.Replace).

However, replacing "&" only where it isn't already followed by "amp;" does need a regex, specifically a zero-width negative lookahead assertion [1]:

var result = Regex.Replace(input, "&(?!amp;)", "&amp;");

[1] The reason to use a zero-width assertion is to handle "&" at the end of the string: the lookahead consumes no characters, so it also succeeds when nothing follows the ampersand.
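As a rough illustration of how the "&(?!amp;)" pattern behaves on a few inputs, including an ampersand at the very end of the string, here is a small sketch. The class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // Escapes every "&" that is not immediately followed by "amp;".
    public static string EscapeUnlessAmp(string input)
    {
        return Regex.Replace(input, "&(?!amp;)", "&amp;");
    }

    public static void Main()
    {
        Console.WriteLine(EscapeUnlessAmp("AT&T"));       // AT&amp;T
        Console.WriteLine(EscapeUnlessAmp("a &amp; b"));  // a &amp; b (unchanged)
        Console.WriteLine(EscapeUnlessAmp("R&D &"));      // R&amp;D &amp;
        Console.WriteLine(EscapeUnlessAmp("x &gt; y"));   // x &amp;gt; y
    }
}
```

The last case shows the limit of this pattern: it only recognises "&amp;", so other existing entities such as "&gt;" get their ampersand escaped as well.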

Upvotes: 6
