Reputation: 171
I have a string variable that contains the following HTML data:
<p>
<em><strong>This is some <span style="background-color: rgb(255, 255, 0);">rich </span>text. 3 < 5 is a valid statement. <br />
</strong></em></p>
I need to strip out the HTML but keep any less-than or greater-than signs, in case the data contains mathematical expressions (like the "3 < 5" portion of the string). I cannot use third-party applications or tools because of restrictions on our site, and would prefer to use only what is available in .NET Framework 3.5. I have tried the following regular expressions, but they do not handle the less-than/greater-than symbols:
<[^>]*>
<[^>]+>
<(.|\n)*?>
\<[^\>]*\>
I have also tried the code on this link, but it does not handle the less-than/greater-than symbols either.
Any suggestions are greatly appreciated.
Upvotes: 1
Views: 1596
Reputation: 25994
Replace all text matching this pattern with an empty string:
(<[^\s]+[^<>]*>)+
(I tested it on Rubular.com, but it should work for C# too.)
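In C#, the call should look like this (assuming RegexObj is a Regex built from the pattern above; note that the input must be a double-quoted string, here a verbatim string with the inner quotes doubled):
RegexObj.Replace(@"<p> <em><strong>This is some <span style=""background-color: rgb(255, 255, 0);"">rich </span>text. 3 < 5 is a valid statement. <br /> </strong></em></p>", "")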
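For anyone who wants to try this end to end, here is a minimal sketch of a complete program that should compile against .NET 3.5. The class name `StripTagsDemo`, the helper `Strip`, and the second pattern `TightPattern` are my own assumptions for illustration and are not part of the answer above. One caveat with the pattern as posted: `[^\s]+` can itself match `<` and `>`, so when the text between two tags contains no whitespace (for example `<b>bold</b>`), the whole run is matched as one "tag" and the text is lost. The tightened variant excludes angle brackets from the tag body to avoid that.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StripTagsDemo
{
    // Pattern exactly as posted in the answer.
    public static readonly Regex AnswerPattern = new Regex(@"(<[^\s]+[^<>]*>)+");

    // Hypothetical tightened variant (not from the answer): the character after '<'
    // must not be whitespace or an angle bracket, so a single match cannot span two tags.
    public static readonly Regex TightPattern = new Regex(@"<[^\s<>][^<>]*>");

    // Removes everything the pattern matches and trims the leftover outer whitespace.
    public static string Strip(Regex pattern, string html)
    {
        return pattern.Replace(html, "").Trim();
    }

    public static void Main()
    {
        // The sample from the question, as a verbatim string with doubled inner quotes.
        string html = @"<p> <em><strong>This is some <span style=""background-color: rgb(255, 255, 0);"">rich </span>text. 3 < 5 is a valid statement. <br /> </strong></em></p>";

        // Both patterns keep "3 < 5" because the '<' is followed by a space.
        Console.WriteLine(Strip(AnswerPattern, html)); // This is some rich text. 3 < 5 is a valid statement.
        Console.WriteLine(Strip(TightPattern, html));  // This is some rich text. 3 < 5 is a valid statement.

        // Without whitespace between the tags, the posted pattern also removes the text.
        Console.WriteLine(Strip(AnswerPattern, "<b>bold</b> and 3 < 5")); // and 3 < 5
        Console.WriteLine(Strip(TightPattern, "<b>bold</b> and 3 < 5"));  // bold and 3 < 5
    }
}
```

Both patterns are heuristics that depend on whitespace following a literal less-than sign. Input such as "3 <5 and 6> 2" would still be treated as a tag and stripped, so this only works if the math in your data is consistently written with spaces around the operators.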
Upvotes: 3