Reputation: 181
Content is
Hello World.
<a href="#" target=_blank>hello World</a>
How do I replace the &nbsp; in the HTML code (inside the tags) and keep the other &nbsp; in the text?
Upvotes: 18
Views: 87389
Reputation: 1
// It works!
// Needs: using System; using System.Text; using System.Text.RegularExpressions; using System.Web;
string a = UnHtml(text);
//---------------------------------------------------
private static readonly Regex _tags_ = new Regex(@"<[^>]+?>", RegexOptions.Multiline | RegexOptions.Compiled);
// add characters that should not be removed to this regex
private static readonly Regex _notOkCharacter_ = new Regex(@"[^\w;&#@.:/\\?=|%!() -]", RegexOptions.Compiled);

// Decodes the input, strips comments, scripts, styles and tags, and collapses whitespace.
public static String UnHtml(String html)
{
    html = HttpUtility.UrlDecode(html);
    html = HttpUtility.HtmlDecode(html);
    html = RemoveTag(html, "<!--", "-->");
    html = RemoveTag(html, "<script", "</script>");
    html = RemoveTag(html, "<style", "</style>");
    // replace matches of these regexes with space
    html = _tags_.Replace(html, " ");
    html = _notOkCharacter_.Replace(html, " ");
    html = SingleSpacedTrim(html);
    return html;
}

// Removes every span from startTag through endTag (inclusive), case-insensitively.
private static String RemoveTag(String html, String startTag, String endTag)
{
    Boolean bAgain;
    do
    {
        bAgain = false;
        Int32 startTagPos = html.IndexOf(startTag, 0, StringComparison.CurrentCultureIgnoreCase);
        if (startTagPos < 0)
            continue;
        Int32 endTagPos = html.IndexOf(endTag, startTagPos + 1, StringComparison.CurrentCultureIgnoreCase);
        if (endTagPos <= startTagPos)
            continue;
        html = html.Remove(startTagPos, endTagPos - startTagPos + endTag.Length);
        bAgain = true;
    } while (bAgain);
    return html;
}

// Collapses each run of whitespace into one space and trims both ends.
private static String SingleSpacedTrim(String inString)
{
    StringBuilder sb = new StringBuilder();
    Boolean inBlanks = false;
    foreach (Char c in inString)
    {
        switch (c)
        {
            case '\r':
            case '\n':
            case '\t':
            case ' ':
                if (!inBlanks)
                {
                    inBlanks = true;
                    sb.Append(' ');
                }
                continue;
            default:
                inBlanks = false;
                sb.Append(c);
                break;
        }
    }
    return sb.ToString().Trim();
}
https://newbedev.com/remove-html-tags-from-string-including-nbsp-in-c
Upvotes: -2
Reputation: 31
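// Three alternatives; use one of them:
string A = HttpContext.Current.Server.HtmlDecode(Text); // decodes entities, so &nbsp; becomes U+00A0
string A = Text.Replace("\u00A0", " ");                 // replaces the non-breaking space character itself
string A = Text.Replace("&nbsp;", " ");                 // replaces the literal &nbsp; entity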
Upvotes: 3
Reputation: 83
Just replace &nbsp; with string.Empty on the text, like below:
xyz.Text.Replace("&nbsp;", string.Empty);
Upvotes: 1
Reputation: 231
For me the best is:
Imports System.Web
HttpUtility.HtmlDecode(codeHtml)
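One thing worth knowing about the decode approach: the decoder turns &nbsp; into the non-breaking space character (U+00A0), not into an ordinary space, and it does so everywhere, in tags and in text alike. Below is a rough C# sketch of that behaviour. It assumes System.Net.WebUtility.HtmlDecode instead of HttpUtility (so it needs no reference to System.Web), and the class and method names are made up for illustration.

```csharp
using System;
using System.Net;

public static class DecodeDemo
{
    // Hypothetical helper: decodes HTML entities, then maps the
    // non-breaking space (U+00A0) to an ordinary space (U+0020).
    public static string DecodeToPlainSpaces(string html)
    {
        string decoded = WebUtility.HtmlDecode(html);
        return decoded.Replace('\u00A0', ' ');
    }

    public static void Main()
    {
        string decoded = WebUtility.HtmlDecode("hello&nbsp;World");
        Console.WriteLine(decoded[5] == '\u00A0');                  // True
        Console.WriteLine(DecodeToPlainSpaces("hello&nbsp;World")); // hello World
    }
}
```

Note that decoding also converts every other entity (&lt;, &amp;, and so on), so it is only suitable when you want plain text rather than markup.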
Upvotes: 21
Reputation: 81680
This will find all the stretches of text (tags) containing &nbsp;:
<[^>]+?&nbsp;[^<]+?>
From here you can do a simple string replace with a space, since Regex gives you the location of each match in your text.
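A possible way to combine the two steps (find the tags, then replace inside them) is Regex.Replace with a match evaluator. This is only a sketch: it swaps in the simpler pattern <[^>]+> so that every tag is visited, the class and method names are invented, and the input string is an assumed example, not necessarily the asker's exact content.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagNbspCleaner
{
    // Matches anything that looks like a tag: '<', one or more non-'>' characters, '>'.
    private static readonly Regex Tag = new Regex(@"<[^>]+>", RegexOptions.Compiled);

    // Replaces &nbsp; with a space inside each matched tag; text between tags is left alone.
    public static string Clean(string html)
    {
        return Tag.Replace(html, m => m.Value.Replace("&nbsp;", " "));
    }

    public static void Main()
    {
        // Assumed sample input with &nbsp; both inside the tag and in the text.
        string input = "&nbsp;Hello World. <a&nbsp;href=\"#\"&nbsp;target=_blank>hello&nbsp;World</a>";
        Console.WriteLine(Clean(input));
        // &nbsp;Hello World. <a href="#" target=_blank>hello&nbsp;World</a>
    }
}
```

Like any regex over HTML, this breaks if an attribute value contains a literal > character.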
Upvotes: 1
Reputation: 777
It's simple:
yourString.Replace("&nbsp;", " ");
String class http://msdn.microsoft.com/en-us/library/system.string.aspx
Replace method http://msdn.microsoft.com/en-us/library/fk49wtc1.aspx
Upvotes: 5
Reputation: 336368
Can you try searching for
(?<=<[^>]*)&nbsp;
and replacing it with a single space?
This looks for &nbsp; inside tags (preceded by a < and possibly other characters except >).
This is extremely brittle, though. For example, it will fail if you have < or > symbols in strings/attributes. Better avoid getting those &nbsp; into the wrong locations in the first place.
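For what it's worth, here is how the lookbehind idea might look in C#. .NET's regex engine allows variable-length lookbehind, which this pattern relies on. The class name, method name and sample input are assumptions made for the sketch.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NbspInTags
{
    // &nbsp; preceded by '<' and then only non-'>' characters,
    // i.e. an &nbsp; that sits inside a tag that has not been closed yet.
    private static readonly Regex NbspInsideTag =
        new Regex(@"(?<=<[^>]*)&nbsp;", RegexOptions.Compiled);

    public static string Fix(string html)
    {
        return NbspInsideTag.Replace(html, " ");
    }

    public static void Main()
    {
        // Assumed sample input with &nbsp; both inside the tag and in the text.
        string input = "&nbsp;Hello&nbsp;World. <a&nbsp;href=\"#\"&nbsp;target=_blank>hello&nbsp;World</a>";
        Console.WriteLine(Fix(input));
        // &nbsp;Hello&nbsp;World. <a href="#" target=_blank>hello&nbsp;World</a>
    }
}
```

The same brittleness applies: a stray < in the text, or a > inside an attribute value, will make the lookbehind match (or miss) in the wrong places.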
Upvotes: 4